Free Access to Cloudera.CDP-3002.v2025-11-21.q109 with Valid Practice Test (Page 13)

Question 56

You need to design your Airflow DAG for data quality checks to be scalable and manageable as the number of datasets and checks grows. How can you achieve this?

A. Hardcode all data quality checks and data sources directly within the DAG code.
B. Utilize Airflow variables to store configuration details like data source paths and check thresholds.
C. Implement a modular design using sub-DAGs, where each sub-DAG encapsulates the data quality checks for a specific dataset.
D. Leverage external configuration files (e.g., YAML or JSON) to define data quality checks and associated parameters.
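
To make the configuration-driven idea in option D concrete, here is a minimal sketch in plain Python (no Airflow required). The JSON layout, the check names `min_rows` and `max_null_fraction`, and the `run_checks` helper are all hypothetical, chosen only for illustration. In a real DAG the same loop could generate one task per configured check instead of calling the functions directly.

```python
import json

# Hypothetical configuration. In practice this would live in a separate
# .json or .yaml file, so checks can be added without touching the code.
CONFIG_TEXT = """
{
  "datasets": [
    {"name": "orders", "checks": [
      {"type": "min_rows", "threshold": 3},
      {"type": "max_null_fraction", "column": "amount", "threshold": 0.5}
    ]}
  ]
}
"""

def min_rows(rows, threshold):
    """Pass when the dataset has at least `threshold` rows."""
    return len(rows) >= threshold

def max_null_fraction(rows, column, threshold):
    """Pass when the share of missing values in `column` is at most `threshold`."""
    if not rows:
        return True
    nulls = sum(1 for row in rows if row.get(column) is None)
    return nulls / len(rows) <= threshold

# Registry mapping the "type" field in the config to a check function.
CHECKS = {"min_rows": min_rows, "max_null_fraction": max_null_fraction}

def run_checks(config, data):
    """Run every configured check and return {(dataset, check_type): passed}."""
    results = {}
    for dataset in config["datasets"]:
        rows = data[dataset["name"]]
        for check in dataset["checks"]:
            params = {k: v for k, v in check.items() if k != "type"}
            results[(dataset["name"], check["type"])] = CHECKS[check["type"]](rows, **params)
    return results

config = json.loads(CONFIG_TEXT)
sample = {"orders": [{"amount": 10}, {"amount": None}, {"amount": 7}]}
results = run_checks(config, sample)
print(results)
```

Adding a dataset or tightening a threshold then means editing the configuration only; new check types still need one function and one registry entry.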

Question 57

Your Airflow DAG involves tasks that require access to specific resources like databases or external services. How can you ensure these resources are available and properly configured for the DAG execution?

A. Hardcode connection details (credentials, URLs) directly within the DAG code.
B. Utilize Airflow connections to store and manage resource details securely.
C. Implement custom logic within each task to dynamically discover and connect to resources.
D. Grant everyone access to all resources to avoid potential configuration issues.

Question 58

How can you secure your data pipelines within the Cloudera Data Engineering service to ensure data privacy and compliance?

A. Rely solely on access control lists (ACLs) defined on individual data assets.
B. Implement encryption for data at rest and in transit but ignore user access control.
C. Employ a layered security approach combining access control, encryption, and audit logging.
D. Utilize Cloudera Manager security features without additional configuration within the Data Engineering service.

Question 59

What is the purpose of partitioning data in Spark?

A. To improve data compression efficiency
B. To enable parallel processing across multiple nodes
C. To enforce data access control
D. To optimize data visualization
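
As a rough illustration of how partitioning relates to parallel work, the sketch below splits keyed records into partitions and processes each partition independently. It is plain Python using threads on one machine, not Spark and not a multi-node setup; the helper names (`partition_by_key`, `sum_partition`, `parallel_sum_by_key`) are hypothetical. The point it shows is that records with the same key land in the same partition, so each partition can be aggregated without coordinating with the others.

```python
from concurrent.futures import ThreadPoolExecutor

def partition_by_key(records, num_partitions):
    """Assign each (key, value) record to a partition by key modulo partition count."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[key % num_partitions].append((key, value))
    return partitions

def sum_partition(partition):
    """Aggregate one partition on its own; no other partition is consulted."""
    totals = {}
    for key, value in partition:
        totals[key] = totals.get(key, 0) + value
    return totals

def parallel_sum_by_key(records, num_partitions=3):
    """Sum values per key, processing the partitions concurrently."""
    partitions = partition_by_key(records, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_results = list(pool.map(sum_partition, partitions))
    merged = {}
    for totals in partial_results:
        # Keys never overlap across partitions, so a plain update is safe.
        merged.update(totals)
    return merged

records = [(1, 10), (2, 5), (1, 7), (3, 1), (2, 2)]
totals = parallel_sum_by_key(records)
print(totals)
```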

Question 60

You need to design an Airflow DAG that waits for a specific file to become available before proceeding with the downstream tasks. How can you achieve this dependency?

A. Use the FileSensor to check for the file's existence and trigger downstream tasks upon its arrival.
B. Implement a custom loop within a Python operator to continuously check for the file until it appears.
C. Configure the source system to notify Airflow when the file is ready for processing.
D. Schedule the DAG to run periodically, hoping the file becomes available eventually.
