1. What is a key characteristic of batch processing systems?
A. They process data interactively as it arrives
B. They perform large-scale computations on a dataset in discrete jobs
C. They require real-time dashboards for each step
D. They must respond to user queries in milliseconds
2. Which framework popularized the batch processing model built on "map" and "reduce" functions?
A. Apache Spark
B. MapReduce (Hadoop)
C. Apache Flink
D. Apache Kafka
3. What advantage does a DAG-based (Directed Acyclic Graph) execution engine have over traditional MapReduce?
A. It can only run on a single machine
B. It supports iterative processing and more complex data flows
C. It prevents any form of shuffling or partitioning
D. It eliminates the need for any distributed file system
4. In an incremental batch processing workflow, what is a key challenge?
A. Ensuring the entire dataset is always processed from scratch
B. Managing partial updates and ensuring consistency between old and new data
C. Guaranteeing that the data volume never changes
D. Only allowing a single job to run at a time
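
Study note for question 2: the "map" and "reduce" functions can be illustrated with a small single-process sketch. This is plain Python and does not use any framework named in the options. The function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical and chosen only for illustration. A real engine would distribute each phase across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an in-memory dataset."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["batch jobs run", "batch jobs finish"]))
```

The whole input is read, processed as one discrete job, and a complete result is produced at the end, which is the pattern the questions above describe as batch processing.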