1. What does the term “big data” refer to?
2. What are the three main characteristics of big data known as the “Three Vs”?
3. Which term refers to the process of analyzing large datasets to uncover hidden patterns and insights?
4. What is the primary goal of data preprocessing in big data analysis?
5. What is the role of Hadoop in big data processing?
6. Which programming language is commonly used for big data analysis and processing?
7. What is the purpose of MapReduce in Hadoop?
8. What is the main advantage of using distributed storage systems in big data environments?
9. Which type of data refers to information that is generated in real-time and requires immediate processing?
10. What is the purpose of data partitioning in big data processing?
11. Which of the following is NOT a characteristic of Big Data?
12. What does the 'Volume' aspect of Big Data refer to?
13. What is a key benefit of Big Data analysis?
14. Which of the following is the best description of Big Data?
15. Which of the following statements is true about the relationship between Big Data and traditional data processing?
16. Which of the following challenges is specifically associated with Big Data's velocity?
17. Which type of data does the variety aspect of Big Data primarily address?
18. Which command is used to list the files in a Hadoop directory?
19. A Big Data job is failing due to a lack of sufficient memory. What is the most likely cause?
20. Which of the following is NOT one of the 3Vs of Big Data?
21. Data of ___________ bytes in size is called Big Data.
22. How many V's does Big Data have?
23. The transaction data of a bank is which type of data?
24. In how many forms can Big Data be found?
25. Which of the following are benefits of Big Data processing?
26. Which of the following are NOT Big Data technologies?
27. What percentage of the world’s total data has been created within just the past two years?
28. Apache Kafka is an open-source platform that was created by ___________.
29. What was Hadoop named after?
30. What are the main components of Big Data?
31. All of the following accurately describe Hadoop, EXCEPT ____________
32. __________ has the world’s largest Hadoop cluster.
33. As companies move past the experimental phase with Hadoop, many cite the need for additional capabilities, including _______________
34. Point out the correct statement.
35. According to analysts, for what can traditional IT systems provide a foundation when they’re integrated with big data technologies like Hadoop?
36. Hadoop is a framework that works with a variety of related tools. Common cohorts include ____________
37. Point out the wrong statement.
38. __________ can best be described as a programming model used to develop Hadoop-based applications that can process massive amounts of data.
39. Facebook tackles Big Data with _______ based on Hadoop.
40. The type of consistency in BASE for NoSQL is ___________.
41. An algorithm that divides the entire file of baskets into segments small enough that all frequent itemsets for each segment can be found in main memory is:
42. Which of the following factors have an impact on the Google PageRank?
43. The Map function takes which of the following as input?
44. Two k-cliques are adjacent when they share ___________.
45. Identify the 3 V’s of Big Data.
46. The PCY algorithm is used in the field of big data analytics for ___________.
47. Stream queries that are essentially questions asked about the current state of a stream or streams are called ___________.
48. A heartbeat is used to communicate between ___________.
49. How is a Bloom filter different from other filtering algorithms in data stream mining?
50. Which is an important feature of Big Data Analytics?
51. A sparse matrix system that uses a row and a column as keys is called ___________.
52. What do you always have to specify for a MapReduce job?
53. The only security feature that exists in Hadoop is ___________.
54. In which of the relational algebra operations is the Reduce function the identity?
55. Assume that a text file contains the following text: "This is a test. Yes it is". In MapReduce logic for finding the frequency of occurrence of each word in this file, what is the output of the Map function?
56. The Flajolet-Martin algorithm depends upon ___________.
57. In the decaying window algorithm, we assign ___________.
58. In the DGIM algorithm, ___________.
59. In the FM algorithm, for each stream element a, let r(a) be the number of _____ in h(a).
60. The Euclidean distance between the points (Age 21, Income 500) and (Age 24, Income 504) is ___________.
61. The Jaccard distance between Set1 = {1,0,1,1,1} and Set2 = {1,0,0,1,1} is ___________.
62. A Bloom filter consists of an array of n bits, initially all ___________.
63. Which algorithm is used to estimate the number of distinct elements seen in a stream?
64. The right end of a bucket in the DGIM algorithm is always a position with a ___________.
65. A collection of pages whose purpose is to increase the PageRank of a certain page or pages is called a ___________.
66. To compute PageRank, we need to know the ___________.


