  1. pyspark - How to use AND or OR condition in when in Spark

    pyspark.sql.functions.when takes a Boolean Column as its condition. When using PySpark, it's often useful to think "Column Expression" when you read "Column". Logical operations on …

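    A minimal sketch of the point in that snippet; the column name and data here are hypothetical:

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(1,), (5,)], ["n"])

        # F.col("n") > 3 is itself a Boolean Column expression,
        # which is exactly what when() expects as its condition.
        df.withColumn("size", F.when(F.col("n") > 3, "big").otherwise("small")).show()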
  2. pyspark - Adding a dataframe to an existing delta table throws …

    Jun 9, 2024 · Fix: the issue was due to mismatched data types. Explicitly declaring the schema type resolved it. schema = StructType([ StructField("_id", StringType(), True), …

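    A sketch of the kind of explicit schema declaration that answer describes, assuming the delta-spark package is available; everything beyond the _id field (the extra column, the rows, and the table name) is hypothetical:

        from pyspark.sql import SparkSession
        from pyspark.sql.types import StructType, StructField, StringType, IntegerType

        spark = SparkSession.builder.getOrCreate()

        # Declaring the schema up front keeps column types consistent with
        # the existing Delta table instead of relying on type inference.
        schema = StructType([
            StructField("_id", StringType(), True),
            StructField("count", IntegerType(), True),  # hypothetical extra field
        ])
        rows = [("abc", 1), ("def", 2)]  # hypothetical data
        df = spark.createDataFrame(rows, schema=schema)
        df.write.format("delta").mode("append").saveAsTable("events")  # hypothetical table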
  3. PySpark: multiple conditions in when clause - Stack Overflow

    Jun 8, 2016 · Very helpful observation: in PySpark, multiple conditions can be built using & (for and) and | (for or). Note: in PySpark it is important to enclose every expression within …

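    A sketch of that pattern; every comparison is wrapped in parentheses because & and | bind more tightly than == and > in Python (data and names here are hypothetical):

        from pyspark.sql import SparkSession
        from pyspark.sql import functions as F

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(1, "a"), (5, "b"), (7, "a")], ["n", "tag"])

        df.withColumn(
            "flag",
            F.when((F.col("n") > 3) & (F.col("tag") == "a"), "big-a")
             .when((F.col("n") > 3) | (F.col("tag") == "b"), "big-or-b")
             .otherwise("other"),
        ).show()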
  4. pyspark

    Jan 2, 2023 · I am very new to PySpark and am getting the below error, even if I drop all date-related columns or select only one column. The date format stored in my data frame is like "". …

  5. Show distinct column values in pyspark dataframe - Stack Overflow

    With a PySpark dataframe, how do you do the equivalent of pandas df['col'].unique()? I want to list out all the unique values in a PySpark dataframe column. Not the SQL type way …

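    A sketch of the usual DataFrame-API answer (column name and data are hypothetical):

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([("a",), ("b",), ("a",)], ["col"])

        # distinct() stays in the DataFrame API; collect() then materializes
        # the unique values as plain Python objects, like pandas .unique().
        values = [row["col"] for row in df.select("col").distinct().collect()]
        print(values)  # e.g. ['a', 'b'] -- order is not guaranteed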
  6. python - PySpark: "Exception: Java gateway process exited before ...

    I'm trying to run PySpark on my MacBook Air. When I try starting it up, I get the error: Exception: Java gateway process exited before sending the driver its port number when sc = …

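    The question itself reports no fix; one commonly cited cause is PySpark failing to find a usable JDK, so a hedged sanity check might look like this (the path is hypothetical):

        import os
        from pyspark.sql import SparkSession

        # PySpark launches a JVM; if JAVA_HOME is unset or points at an
        # incompatible JDK, the gateway can exit before reporting its port.
        os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/jdk-11.jdk/Contents/Home"  # hypothetical
        spark = SparkSession.builder.getOrCreate()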
  7. How to import pyspark.sql.functions all at once? - Stack Overflow

    Dec 23, 2021 · from pyspark.sql.functions import isnan, when, count, sum, etc. It is very tiresome adding all of them. Is there a way to import all of them at once?

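    Two common approaches, sketched below; the namespaced alias is generally preferred because a star import shadows Python builtins such as sum, min, and max:

        # Option 1: star import -- pulls in everything at once.
        from pyspark.sql.functions import *

        # Option 2: a single short alias, no shadowing of builtins.
        import pyspark.sql.functions as F
        # ...then call F.when, F.count, F.isnan, F.sum, and so on.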
  8. Pyspark: display a spark data frame in a table format

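    The standard answer is show(), sketched here with hypothetical data:

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "tag"])

        # show() renders the first n rows as an ASCII table on stdout;
        # truncate=False keeps long cell values intact.
        df.show(n=20, truncate=False)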

  9. How to change dataframe column names in PySpark?

    I come from a pandas background and am used to reading data from CSV files into a dataframe and then simply changing the column names to something useful using a simple command: …

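    A sketch of the two usual renaming idioms (names and data are hypothetical):

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(1, 2)], ["_c0", "_c1"])

        # Rename one column at a time...
        df = df.withColumnRenamed("_c0", "id")

        # ...or replace every column name at once, pandas-style.
        df = df.toDF("id", "value")
        df.printSchema()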
  10. How to find the size or shape of a DataFrame in PySpark?

    Why doesn't a PySpark DataFrame simply store its shape the way a pandas DataFrame does with .shape? Having to call count seems incredibly resource-intensive for such a common and …
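    A sketch of the usual workaround, given that there is no .shape attribute (data is hypothetical):

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.getOrCreate()
        df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "tag"])

        # Rows require a distributed count() job; columns come from the
        # schema, which is known without touching the data.
        shape = (df.count(), len(df.columns))
        print(shape)  # (2, 2)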