Free Access to Databricks.Associate-Developer-Apache-Spark-3.5.v2025-11-20.q72 with Valid Practice Test (Page 8)

Question 31

A data engineer is streaming data from Kafka and requires:
Minimal latency
Exactly-once processing guarantees
Which trigger mode should be used?

A. .trigger(processingTime='1 second')
B. .trigger(continuous=True)
C. .trigger(continuous='1 second')
D. .trigger(availableNow=True)
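
For reference, the four options differ only in the keyword they pass to `DataStreamWriter.trigger`. The sketch below is a hypothetical helper (`check_trigger` is not part of PySpark) that mirrors, as an assumption, the argument checks PySpark applies: exactly one trigger keyword, a non-empty interval string for `processingTime` and `continuous`, and `True` for `once` and `availableNow`. It only shows the shape of each call and says nothing about latency or delivery guarantees.

```python
def check_trigger(**kwargs):
    """Hypothetical helper: validate keyword arguments the way a call to
    DataStreamWriter.trigger(...) is assumed to validate them."""
    allowed = {"processingTime", "once", "continuous", "availableNow"}
    given = {k: v for k, v in kwargs.items() if v is not None}
    unknown = set(given) - allowed
    if unknown:
        raise TypeError(f"unexpected trigger keyword(s): {sorted(unknown)}")
    if len(given) != 1:
        raise ValueError("exactly one trigger keyword must be provided")
    name, value = next(iter(given.items()))
    if name in ("processingTime", "continuous"):
        # Interval-based triggers take a string such as '1 second'.
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{name} must be a non-empty interval string")
    elif value is not True:
        # Flag-style triggers (once, availableNow) only accept True.
        raise ValueError(f"{name} must be True")
    return name, value


# The four option forms from the question, checked one by one.
candidates = {
    "A": {"processingTime": "1 second"},
    "B": {"continuous": True},
    "C": {"continuous": "1 second"},
    "D": {"availableNow": True},
}
for label, kwargs in candidates.items():
    try:
        check_trigger(**kwargs)
        print(label, "well-formed call")
    except ValueError as exc:
        print(label, "rejected:", exc)
```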

Question 32

What is the benefit of using Pandas on Spark for data transformations?
Options:

A. It is available only with Python, thereby reducing the learning curve.
B. It computes results immediately using eager execution, making it simple to use.
C. It runs on a single node only, using memory-bound DataFrames, and is hence cost-efficient.
D. It executes queries faster using all the available cores in the cluster and provides pandas's rich set of features.

Question 33

A data engineer is asked to build an ingestion pipeline for a set of Parquet files delivered by an upstream team on a nightly basis. The data is stored in a directory structure with a base path of "/path/events/data". The upstream team drops daily data into the underlying subdirectories following the convention year/month/day.
A few examples of the directory structure are:

Which of the following code snippets will read all the data within the directory structure?

A. df = spark.read.option("inferSchema", "true").parquet("/path/events/data/")
B. df = spark.read.option("recursiveFileLookup", "true").parquet("/path/events/data/")
C. df = spark.read.parquet("/path/events/data/*")
D. df = spark.read.parquet("/path/events/data/")

Question 34

A Spark application is experiencing performance issues in client mode because the driver is resource-constrained.
How should this issue be resolved?

A. Add more executor instances to the cluster
B. Increase the driver memory on the client machine
C. Switch the deployment mode to cluster mode
D. Switch the deployment mode to local mode

Question 35

In the code block below, aggDF contains aggregations on a streaming DataFrame:
aggDF.writeStream \
.format("console") \
.outputMode("???") \
.start()
Which output mode at line 3 ensures that the entire result table is written to the console during each trigger execution?

A. AGGREGATE
B. COMPLETE
C. REPLACE
D. APPEND
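
Structured Streaming documents three output modes: append, complete, and update. A minimal sketch of the writer from the question, with the mode passed as a parameter, might look like this. `start_console_query` is a hypothetical helper name, and `agg_df` is assumed to be a streaming DataFrame with aggregations, as in the question.

```python
# Output modes documented for Structured Streaming (matched case-insensitively).
OUTPUT_MODES = ("append", "complete", "update")


def start_console_query(agg_df, mode):
    """Hypothetical helper: start a console sink with the given output mode."""
    if mode.lower() not in OUTPUT_MODES:
        raise ValueError(f"unknown output mode: {mode!r}")
    return (
        agg_df.writeStream
        .format("console")
        .outputMode(mode)
        .start()
    )
```

In complete mode the whole result table is written to the sink after every trigger, so a call such as `start_console_query(aggDF, "complete")` would request that behavior.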
