Consider the following code snippet:

# Sample DataFrame (assuming it exists)
df = spark.createDataFrame(...)
# Attempt to add a new column with a case-when expression (fix the error)
df = df.withColumn("category", F.when(df["price"] ] 100, "Expensive").otherwise("Cheap"))
df.show()

What is the error in this code, and how can it be fixed?
What are the potential trade-offs to consider when using checkpointing in Spark applications?
You notice degraded read performance on an Iceberg table after many updates and deletes. What maintenance task should you perform to improve this?
Which of the following strategies would NOT be recommended for managing skewed data during join operations in Spark?
You need to create a new Hive table from a Spark DataFrame. What are the different approaches you can consider?