Free Access to Databricks.Databricks-Certified-Professional-Data-Engineer.v2024-05-28.q108 with Valid Practice Test (Page 18)

Question 81

A production cluster has 3 executor nodes and uses the same virtual machine type for the driver and the executors.
When evaluating the Ganglia Metrics for this cluster, which indicator would signal a bottleneck caused by code executing on the driver?

A.The Five Minute Load Average remains consistent/flat
B.Bytes Received never exceeds 80 million bytes per second
C.Total Disk Space remains constant
D.Network I/O never spikes
E.Overall cluster CPU utilization is around 25%
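
As a rough sketch of the arithmetic behind a cluster-wide CPU figure: with one driver and three executors of the same VM type, each node contributes a quarter of the cluster's CPU capacity. The helper below is a hypothetical name (not part of Ganglia or Databricks), and it assumes equally sized nodes and a scenario where one node is saturated while the other three sit idle.

```python
def cluster_cpu_utilization(per_node_utilization):
    """Average CPU utilization across equally sized nodes (values in 0.0-1.0)."""
    return sum(per_node_utilization) / len(per_node_utilization)


# Assumed scenario: the driver is fully busy and the 3 executors are idle.
nodes = [1.0, 0.0, 0.0, 0.0]  # [driver, executor, executor, executor]
utilization = cluster_cpu_utilization(nodes)
print(f"{utilization:.0%}")
```

Under these assumptions, a single busy node out of four shows up as one quarter of total cluster CPU.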

Question 82

To reduce storage and compute costs, the data engineering team has been tasked with curating a series of aggregate tables leveraged by business intelligence dashboards, customer-facing applications, production machine learning models, and ad hoc analytical queries.
The data engineering team has been made aware of new requirements from a customer-facing application, which is the only downstream workload they manage entirely. As a result, an aggregate table used by numerous teams across the organization will need to have a number of fields renamed, and additional fields will also be added.
Which solution addresses the situation while minimally interrupting other teams in the organization and without increasing the number of tables that need to be managed?

A.Send all users notice that the schema for the table will be changing; include in the communication the logic necessary to revert the new table schema to match historic queries.
B.Configure a new table with all the requisite fields and new names and use this as the source for the customer-facing application; create a view that maintains the original data schema and table name by aliasing select fields from the new table.
C.Create a new table with the required schema and new fields and use Delta Lake's deep clone functionality to sync up changes committed to one table to the corresponding table.
D.Replace the current table definition with a logical view defined with the query logic currently writing the aggregate table; create a new table to power the customer-facing application.
E.Add a table comment warning all users that the table schema and field names will be changing on a given date; overwrite the table in place to the specifications of the customer-facing application.
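
The view-aliasing pattern mentioned in option B can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for a Delta Lake catalog, and the table name `agg_sales_v2`, the view name `agg_sales`, and all column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical new table: renamed fields plus one additional field.
conn.execute(
    "CREATE TABLE agg_sales_v2 ("
    "customer_region TEXT, total_revenue REAL, order_count INTEGER)"
)
conn.execute("INSERT INTO agg_sales_v2 VALUES ('EMEA', 1200.0, 30)")

# A view under the original name exposes the original field names
# by aliasing selected columns from the new table.
conn.execute(
    "CREATE VIEW agg_sales AS "
    "SELECT customer_region AS region, total_revenue AS revenue "
    "FROM agg_sales_v2"
)

# Existing queries against the old name and schema keep working.
cursor = conn.execute("SELECT * FROM agg_sales")
legacy_columns = [column[0] for column in cursor.description]
legacy_rows = cursor.fetchall()
print(legacy_columns, legacy_rows)
```

In this sketch only one physical table exists; the view stores no data of its own and simply re-labels columns at query time.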

Question 83

Where are interactive notebook results stored in the Databricks product architecture?

A.Data plane
B.Control plane
C.Data and Control plane
D.JDBC data source
E.Databricks web application

Question 84

The security team is exploring whether or not the Databricks secrets module can be leveraged for connecting to an external database.
After testing the code with all Python variables being defined with strings, they upload the password to the secrets module and configure the correct permissions for the currently active user. They then modify their code to the following (leaving all other variables unchanged).

Which statement describes what will happen when the above code is executed?

A.The connection to the external table will fail; the string "redacted" will be printed.
B.An interactive input box will appear in the notebook; if the right password is provided, the connection will succeed and the encoded password will be saved to DBFS.
C.An interactive input box will appear in the notebook; if the right password is provided, the connection will succeed and the password will be printed in plain text.
D.The connection to the external table will succeed; the string value of password will be printed in plain text.
E.The connection to the external table will succeed; the string "redacted" will be printed.

Question 85

A junior data engineer has been asked to develop a streaming data pipeline with a grouped aggregation using DataFrame df. The pipeline needs to calculate the average humidity and average temperature for each non-overlapping five-minute interval. Events are recorded once per minute per device.
Streaming DataFrame df has the following schema:
"device_id INT, event_time TIMESTAMP, temp FLOAT, humidity FLOAT"
Code block:

Choose the response that correctly fills in the blank within the code block to complete this task.

A.to_interval("event_time", "5 minutes").alias("time")
B.window("event_time", "5 minutes").alias("time")
C."event_time"
D.window("event_time", "10 minutes").alias("time")
E.lag("event_time", "10 minutes").alias("time")
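
To make "non-overlapping five-minute interval" concrete, here is a minimal plain-Python sketch of tumbling-window averaging. It is not Structured Streaming code and does not reproduce the question's code block; the function names and sample events are hypothetical, and windows are assumed to be aligned to the Unix epoch. In PySpark, the `window` function from `pyspark.sql.functions` performs this kind of bucketing on a timestamp column.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1)


def window_start(event_time, width=WINDOW):
    """Floor a timestamp to the start of its non-overlapping window."""
    offset = (event_time - EPOCH) // width  # whole windows since the epoch
    return EPOCH + offset * width


def average_by_window(events):
    """Average temp and humidity per window.

    events: iterable of (event_time, temp, humidity) tuples.
    Returns {window_start: (avg_temp, avg_humidity)}.
    """
    buckets = defaultdict(list)
    for event_time, temp, humidity in events:
        buckets[window_start(event_time)].append((temp, humidity))
    return {
        start: (
            sum(temp for temp, _ in rows) / len(rows),
            sum(humidity for _, humidity in rows) / len(rows),
        )
        for start, rows in buckets.items()
    }


# Hypothetical sample events: two fall in 12:00-12:05, one in 12:05-12:10.
events = [
    (datetime(2024, 5, 28, 12, 1), 20.0, 40.0),
    (datetime(2024, 5, 28, 12, 4), 22.0, 50.0),
    (datetime(2024, 5, 28, 12, 5), 30.0, 60.0),
]
averages = average_by_window(events)
```

Each event lands in exactly one bucket, which is what distinguishes a tumbling window from a sliding window where intervals overlap.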
