Free Access to Databricks.Associate-Developer-Apache-Spark-3.5.v2025-11-20.q72 with Valid Practice Test (Page 2)

Question 1

What is the benefit of Adaptive Query Execution (AQE)?

A.It allows Spark to optimize the query plan before execution but does not adapt during runtime.
B.It automatically distributes tasks across nodes in the cluster and does not perform runtime adjustments to the query plan.
C.It optimizes query execution by parallelizing tasks and does not adjust strategies based on runtime metrics like data skew.
D.It enables the adjustment of the query plan during runtime, handling skewed data, optimizing join strategies, and improving overall query performance.
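For reference while reviewing this question, here is a minimal sketch of the Spark SQL settings that govern AQE's runtime behavior. The three configuration keys are standard Spark settings; the helper `apply_aqe_settings` and the constant `AQE_SETTINGS` are hypothetical names introduced only for illustration, and the helper assumes it is handed an existing SparkSession.

```python
# Standard Spark SQL configuration keys related to Adaptive Query Execution.
AQE_SETTINGS = {
    # Master switch: re-optimize the plan using runtime statistics.
    "spark.sql.adaptive.enabled": "true",
    # Merge small shuffle partitions after a stage completes.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Split oversized (skewed) partitions in sort-merge joins.
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def apply_aqe_settings(spark, settings=AQE_SETTINGS):
    """Hypothetical helper: set each key on an existing SparkSession."""
    for key, value in settings.items():
        spark.conf.set(key, value)
    return spark
```

With a live session this would be called as `apply_aqe_settings(spark)`; the same keys can also be passed at submit time with `--conf`.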

Question 2

Given this view definition:
df.createOrReplaceTempView("users_vw")
Which approach can be used to query the users_vw view after the session is terminated?
Options:

A.Query the users_vw using Spark
B.Persist the users_vw data as a table
C.Recreate the users_vw and query the data using Spark
D.Save the users_vw definition and query using Spark

Question 3

A Spark developer wants to improve the performance of an existing PySpark UDF that runs a hash function not available in the standard Spark functions library.
The existing UDF code is:
import hashlib
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

def shake_256(raw):
    # Return a 20-byte SHAKE-256 digest as 40 hex characters.
    return hashlib.shake_256(raw.encode()).hexdigest(20)

shake_256_udf = udf(shake_256, StringType())
The developer replaces this UDF with a Pandas UDF for better performance:
@pandas_udf(StringType())
def shake_256(raw: str) -> str:
    return hashlib.shake_256(raw.encode()).hexdigest(20)
However, the developer receives this error:
TypeError: Unsupported signature: (raw: str) -> str
What should the signature of the shake_256() function be changed to in order to fix this error?

A.def shake_256(raw: str) -> str:
B.def shake_256(raw: [pd.Series]) -> pd.Series:
C.def shake_256(raw: pd.Series) -> pd.Series:
D.def shake_256(raw: [str]) -> [str]:
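As a study aid, the following sketch shows what a Series-to-Series Pandas UDF for this hash could look like. It assumes pandas is installed; the names `shake_256_series` and `as_pandas_udf` are hypothetical and chosen for illustration. The Spark wrapping is deferred to a separate function so the hashing logic can be checked without a cluster.

```python
import hashlib

import pandas as pd


def shake_256_series(raw: pd.Series) -> pd.Series:
    """Hash every string in a batch; a Pandas UDF receives whole Series."""
    return raw.map(lambda s: hashlib.shake_256(s.encode()).hexdigest(20))


def as_pandas_udf():
    """Wrap the function for Spark (requires pyspark and pyarrow)."""
    from pyspark.sql.functions import pandas_udf
    from pyspark.sql.types import StringType

    return pandas_udf(shake_256_series, StringType())
```

Because the function operates on a `pd.Series` rather than a single `str`, the body has to apply the hash element by element, which is why the original one-line body cannot be reused unchanged.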

Question 4

Which Spark configuration controls the number of tasks that can run in parallel on the executor?
Options:

A.spark.executor.cores
B.spark.task.maxFailures
C.spark.driver.cores
D.spark.executor.memory
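The arithmetic behind per-executor parallelism can be sketched as follows. This is a simplified illustration that assumes concurrency is bounded only by `spark.executor.cores` and `spark.task.cpus` (which defaults to 1) and ignores other resource constraints; the function name `task_slots_per_executor` is hypothetical.

```python
def task_slots_per_executor(executor_cores: int, task_cpus: int = 1) -> int:
    """Concurrent task slots on one executor.

    executor_cores mirrors spark.executor.cores;
    task_cpus mirrors spark.task.cpus (default 1).
    """
    if executor_cores < 1 or task_cpus < 1:
        raise ValueError("both values must be positive integers")
    # Each running task reserves task_cpus cores on the executor.
    return executor_cores // task_cpus


if __name__ == "__main__":
    print(task_slots_per_executor(4))      # 4 cores, 1 CPU per task
    print(task_slots_per_executor(4, 2))   # 4 cores, 2 CPUs per task
```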

Question 5

A data scientist is working on a project that requires processing large amounts of structured data, performing SQL queries, and applying machine learning algorithms. The data scientist is considering using Apache Spark for this task.
Which combination of Apache Spark modules should the data scientist use in this scenario?
Options:

A.Spark DataFrames, Structured Streaming, and GraphX
B.Spark SQL, Pandas API on Spark, and Structured Streaming
C.Spark Streaming, GraphX, and Pandas API on Spark
D.Spark DataFrames, Spark SQL, and MLlib
