Free Access to Cloudera.CDP-3002.v2025-11-21.q109 with Valid Practice Test (Page 11)

Question 46

In the context of Cloudera's Optimization Framework, what role does data statistics collection play?

A. It provides metadata for security enforcement
B. It is used to generate more data
C. It helps the optimizer make informed decisions about data layout and query execution plans
D. It reduces the need for data compression

Question 47

A data engineer needs to query a table stored in Apache Hive using Spark SQL. Which of the following commands correctly retrieves data from a Hive table named 'sales_data'?

A.
B.
C.
D.

Question 48

You have an Airflow DAG that includes tasks for data extraction, transformation, and loading. You notice that the transformation tasks are computationally intensive and are causing delays in the DAG's execution. To optimize performance, you decide to offload these tasks to a cloud-based service that can scale dynamically. Which approach ensures minimal changes to the DAG structure while integrating this optimization?

A. Replace the transformation tasks with HttpSensor tasks that trigger the cloud service and poll for completion.
B. Use the ExternalTaskSensor to wait for the transformation to complete on the cloud service before proceeding.
C. Modify the transformation tasks to use the PythonOperator to make API calls to the cloud service, handling the transformation.
D. Implement the transformation tasks as DockerOperator tasks, with each task running in a containerized environment on the cloud service.

Question 49

In Apache Airflow, how can you dynamically generate tasks for each table in your database that needs a quality check?

A. Use the SubDagOperator to create a sub-DAG for each table.
B. Use the Variable feature to store a list of tables and iterate over them with a PythonOperator.
C. Utilize the Dynamic Task Mapping feature to create a task for each table.
D. Implement a BranchPythonOperator to create branches for each table dynamically.

Question 50

You want to track changes to an Iceberg table over time for auditing purposes. Which combination of Iceberg features would best support this?

A. Snapshots and partition evolution
B. Snapshots and manifest lists
C. Metadata tables and time travel
D. Hidden partitioning and Iceberg audit logs
