Databricks Certification: Associate-Developer-Apache-Spark-3.5 Exam
Databricks.Associate-Developer-Apache-Spark-3.5.v2025-11-20.q72 Practice Questions

Question 21

What is the benefit of Adaptive Query Execution (AQE)?

Correct Answer: B
Adaptive Query Execution (AQE) is an optimization framework introduced in Apache Spark 3.0 and enabled by default since Spark 3.2. It dynamically adjusts query execution plans based on runtime statistics, which can yield significant performance improvements. The key benefits of AQE include:
Dynamic join strategy selection: AQE can switch join strategies at runtime. For instance, it can convert a sort-merge join to a broadcast hash join if it detects that one side of the join is small enough to be broadcast, optimizing the join operation.
Handling skewed data: AQE detects skewed partitions during join operations and splits them into smaller partitions. This balances the workload across tasks and prevents a few tasks from running far longer than the rest due to data skew.
Coalescing post-shuffle partitions: AQE dynamically coalesces small shuffle partitions into larger ones based on the actual data size, reducing the overhead of scheduling many tiny tasks and improving overall query performance.
These runtime optimizations allow Spark to adapt to the actual data characteristics during query execution, leading to more efficient resource utilization and faster query processing times.
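The partition-coalescing benefit can be sketched in plain Python. This is a simplified model only; the real logic lives inside Spark's AQE and is controlled by configuration such as spark.sql.adaptive.enabled and spark.sql.adaptive.advisoryPartitionSizeInBytes, and the sizes below are invented for illustration:

```python
def coalesce_partitions(sizes, target):
    # Toy version of AQE post-shuffle coalescing: greedily merge adjacent
    # small shuffle partitions until each group reaches the target size.
    groups, current, current_size = [], [], 0
    for i, size in enumerate(sizes):
        current.append(i)
        current_size += size
        if current_size >= target:
            groups.append(current)
            current, current_size = [], 0
    if current:
        groups.append(current)
    return groups

# Eight small shuffle partitions (sizes in MB) merged toward ~64 MB tasks.
print(coalesce_partitions([10, 12, 8, 40, 5, 30, 20, 9], 64))
# [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Instead of scheduling eight tiny tasks, the engine runs two appropriately sized ones, which is the overhead reduction the explanation above describes.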

Question 22

What is the behavior for function date_sub(start, days) if a negative value is passed into the days parameter?

Correct Answer: C
The function date_sub(start, days) subtracts the number of days from the start date. If a negative number is passed, the behavior becomes a date addition.
Example:
SELECT date_sub('2024-05-01', -5)
-- Returns: 2024-05-06
So, a negative value effectively adds the absolute number of days to the date.
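The same behavior can be reproduced with Python's standard datetime module; the function below is a toy stand-in for Spark's date_sub, modeled with timedelta:

```python
from datetime import date, timedelta

def date_sub(start, days):
    # Mimics Spark SQL date_sub(start, days): subtract `days` days from
    # `start`. A negative `days` therefore moves the date forward.
    return start - timedelta(days=days)

print(date_sub(date(2024, 5, 1), -5))  # 2024-05-06
```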

Question 23

You have:
DataFrame A: 128 GB of transactions
DataFrame B: 1 GB user lookup table
Which strategy is correct for broadcasting?

Correct Answer: B
Broadcast joins work by sending the smaller DataFrame to every executor, eliminating the shuffle of the larger DataFrame.
From the Spark documentation:
"Broadcast joins are efficient when one DataFrame is small enough to fit in memory. Spark avoids shuffling the larger table."
DataFrame B (1 GB) exceeds the default spark.sql.autoBroadcastJoinThreshold (10 MB), so Spark will not broadcast it automatically, but it is small enough to fit in executor memory and can be broadcast explicitly with the broadcast() hint.
This eliminates the need to shuffle the large DataFrame A.
Final answer: B
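In PySpark the hint is written as transactions.join(broadcast(users), "user_id"), using pyspark.sql.functions.broadcast. The sketch below mimics the mechanics in plain Python: the small side becomes a hash map replicated to every worker, and the large side is scanned once with no shuffle (the table contents are made up for illustration):

```python
# Small side (DataFrame B): replicated to every "executor" as a dict.
users = {1: "alice", 2: "bob"}

# Large side (DataFrame A): streamed partition by partition, never shuffled.
transactions = [(1, 10.0), (2, 5.5), (1, 3.25)]

# Broadcast hash join: probe the replicated dict for each large-side row.
joined = [(uid, amt, users[uid]) for uid, amt in transactions if uid in users]
print(joined)  # [(1, 10.0, 'alice'), (2, 5.5, 'bob'), (1, 3.25, 'alice')]
```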

Question 24

A Spark developer is developing a Spark application to monitor task performance across a cluster.
One requirement is to track the maximum processing time for tasks on each worker node and consolidate this information on the driver for further analysis.
Which technique should the developer use?

Correct Answer: C
RDD actions like reduce() aggregate values across all partitions and return the result to the driver.
To compute the maximum processing time, reduce() is ideal because it combines results from all tasks efficiently.
Example:
max_time = rdd_times.reduce(lambda x, y: max(x, y))
This aggregates maximum values from all executors into a single result on the driver.
Why the other options are incorrect:
A: Broadcast variables distribute read-only data to executors; they cannot aggregate results back to the driver.
B: The Spark UI provides visualization and monitoring, not programmatic collection.
D: Standard accumulators support only additive updates (e.g., counters, sums); they do not provide a built-in way to track a maximum.
Reference:
Spark RDD API - reduce() for aggregations.
Databricks Exam Guide (June 2025): Section "Apache Spark Architecture and Components" - actions, accumulators, and broadcast variables.
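The reduce() pattern can be illustrated without a cluster using Python's functools.reduce, which combines elements pairwise just as RDD.reduce combines partial results within and then across partitions; the task times here are invented for illustration:

```python
from functools import reduce

# Hypothetical per-task processing times (seconds) reported by workers.
task_times = [3.2, 7.8, 1.4, 9.6, 5.1]

# The same associative, commutative combiner Spark would apply per
# partition and then across partitions, returning one value to the driver.
max_time = reduce(lambda x, y: max(x, y), task_times)
print(max_time)  # 9.6
```

Because max is associative and commutative, each executor can compute a local maximum and the driver only merges a handful of partial results.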

Question 25

A data engineer wants to create an external table from a JSON file located at /data/input.json with the following requirements:
Create an external table named users
Automatically infer the schema
Merge records with differing schemas
Which code snippet should the engineer use?

Correct Answer: C
To create an external table and enable schema merging, the correct syntax is:
CREATE EXTERNAL TABLE users
USING json
OPTIONS (
  path '/data/input.json',
  mergeSchema 'true'
)
mergeSchema is the correct option key (not schemaMerge)
EXTERNAL allows Spark to query the files without managing their lifecycle
Reference: Spark SQL DDL - JSON and Schema Merging