Regarding the differences and connections between Spark SQL and Hive, which of the following statements is correct?
The overall process of a Kafka Producer writing data is as follows: the Producer connects to any surviving Broker, requests the leader metadata for the specified topic and partition, and then connects directly to the corresponding leader Broker to publish the data.
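The three steps in the statement above (bootstrap against any live broker, look up the partition leader, publish to that leader) can be sketched as a toy in-memory model. This is only an illustration, not the Kafka client API: the `Broker`, `Cluster`, and `publish` names are hypothetical, and the model assumes every broker shares the same metadata view.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Hypothetical stand-in for a Kafka broker (not a real client class)."""
    broker_id: int
    alive: bool = True
    # (topic, partition) -> records appended to that partition on this broker
    log: dict = field(default_factory=dict)

    def append(self, topic, partition, record):
        self.log.setdefault((topic, partition), []).append(record)


class Cluster:
    """Toy cluster: assumes all brokers share one view of partition leaders."""

    def __init__(self, brokers, leaders):
        self.brokers = {b.broker_id: b for b in brokers}
        self.leaders = leaders  # (topic, partition) -> leader broker_id

    def metadata(self, topic, partition):
        return self.leaders[(topic, partition)]


def publish(cluster, topic, partition, record):
    # Step 1: connect to any surviving broker (the bootstrap connection).
    bootstrap = next(b for b in cluster.brokers.values() if b.alive)
    # Step 2: request the leader metadata for the topic and partition.
    # In this toy model the answer does not depend on which broker is asked.
    leader_id = cluster.metadata(topic, partition)
    # Step 3: connect directly to the leader broker and publish the record.
    leader = cluster.brokers[leader_id]
    if not leader.alive:
        raise RuntimeError("leader unavailable; metadata would need a refresh")
    leader.append(topic, partition, record)
    return bootstrap.broker_id, leader_id


brokers = [Broker(0, alive=False), Broker(1), Broker(2)]
cluster = Cluster(brokers, {("orders", 0): 2})
print(publish(cluster, "orders", 0, b"hello"))  # (1, 2)
```

Here broker 0 is down, so broker 1 serves the metadata request, while the record itself lands only on broker 2, the leader of the partition.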
A Spark SQL table often contains many small files (files much smaller than the HDFS block size). In this case, Spark starts more tasks to process these small files. When the SQL logic includes a Shuffle operation, the number of hash buckets increases greatly, which seriously affects performance.
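A rough back-of-the-envelope sketch of why the statement above holds, under simplifying assumptions: every file gets at least one map task, larger files are split per 128 MB block, and a hash-based shuffle produces one bucket per (map task, reduce partition) pair. The function names are hypothetical, and real Spark versions may pack small files into fewer tasks depending on configuration, so the numbers are illustrative only.

```python
import math


def map_tasks(file_sizes_mb, block_mb=128):
    # Simplified model: at least one task per file, one task per block
    # for files larger than a block.
    return sum(max(1, math.ceil(size / block_mb)) for size in file_sizes_mb)


def hash_buckets(num_map_tasks, num_reduce_partitions):
    # Assumed hash-shuffle model: one bucket per (map task, reduce partition).
    return num_map_tasks * num_reduce_partitions


# 200 is the default value of spark.sql.shuffle.partitions.
REDUCE_PARTITIONS = 200

small = map_tasks([1] * 1000)   # 1000 files of 1 MB each
merged = map_tasks([125] * 8)   # the same 1000 MB merged into 8 files

print(small, hash_buckets(small, REDUCE_PARTITIONS))    # 1000 200000
print(merged, hash_buckets(merged, REDUCE_PARTITIONS))  # 8 1600
```

Under these assumptions, merging the small files cuts both the task count and the bucket count by a factor of 125 for the same volume of data.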
Which of the following statements about the segment files in Kafka Logs are correct? (Multiple choice)