MCQs Generator

150 Big Data Security MCQs

1. To run Hadoop service daemons in secure mode, ___________ principals are required.

a) SSL
b) Kerberos
c) SSH
d) None of the mentioned
✅ Correct Answer: b) Kerberos
📝 Explanation:
Each service reads authentication information saved in a keytab file with appropriate permissions.

2. Point out the correct statement.

a) Hadoop does have the definition of group by itself
b) The MapReduce JobHistory server runs as the same user, such as mapred
c) SSO environment is managed using Kerberos with LDAP for Hadoop in secure mode
d) None of the mentioned
✅ Correct Answer: c) SSO environment is managed using Kerberos with LDAP for Hadoop in secure mode
📝 Explanation:
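In secure mode, the single sign-on environment is managed using Kerberos with LDAP. Hadoop has no definition of groups by itself; you can change how users are mapped to groups by specifying the name of a mapping provider as the value of hadoop.security.group.mapping.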

3. The simplest way to do authentication is using _________ command of Kerberos.

a) auth
b) kinit
c) authorize
d) all of the mentioned
✅ Correct Answer: b) kinit
📝 Explanation:
kinit obtains and caches a Kerberos ticket for the user. Note that HTTP web consoles should be served by a principal different from the one used for RPC.

4. Data transfer between the web console and clients is protected by using _________

a) SSL
b) Kerberos
c) SSH
d) None of the mentioned
✅ Correct Answer: a) SSL
📝 Explanation:
Web consoles are served over SSL (HTTPS). Separately, for the DataNode data transfer protocol, AES offers the greatest cryptographic strength and the best performance.

5. Point out the wrong statement.

a) Data transfer protocol of DataNode does not use the RPC framework of Hadoop
b) Apache Oozie, which accesses Hadoop services on behalf of end users, needs to be able to impersonate end users
c) DataNode must authenticate itself by using privileged ports which are specified by dfs.datanode.address and dfs.datanode.http.address
d) None of the mentioned
✅ Correct Answer: d) None of the mentioned
📝 Explanation:
Authentication is based on the assumption that the attacker won’t be able to get root privileges.

6. In order to turn on RPC authentication in Hadoop, set the value of hadoop.security.authentication property to _________

a) zero
b) kerberos
c) false
d) none of the mentioned
✅ Correct Answer: b) kerberos
📝 Explanation:
Security settings need to be modified properly for robustness.
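
As a rough illustration of the setting named in this question, the sketch below renders two properties in Hadoop's `<configuration>`/`<property>` layout. The helper name `build_core_site` is hypothetical, and a real secure cluster needs many more settings (principals, keytabs, and so on) than are shown here.

```python
import xml.etree.ElementTree as ET


def build_core_site(properties: dict) -> str:
    """Render name/value pairs in Hadoop's <configuration>/<property> layout."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Only two illustrative properties; this is not a complete secure-mode setup.
secure_settings = {
    "hadoop.security.authentication": "kerberos",  # the default value is "simple"
    "hadoop.security.authorization": "true",       # turns on service-level authorization
}

print(build_core_site(secure_settings))
```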

7. The __________ provides a proxy between the web applications exported by an application and an end user.

a) ProxyServer
b) WebAppProxy
c) WebProxy
d) None of the mentioned
✅ Correct Answer: b) WebAppProxy
📝 Explanation:
If security is enabled it will warn users before accessing a potentially unsafe web application. Authentication and authorization using the proxy is handled just like any other privileged web application.

8. The YARN framework uses a ___________, which defines how any container is launched and controlled.

a) Container
b) ContainerExecutor
c) Executor
d) All of the mentioned
✅ Correct Answer: b) ContainerExecutor
📝 Explanation:
With the DefaultContainerExecutor, the container process runs as the same Unix user as the NodeManager.

9. The ____________ requires that paths including and leading up to the directories specified in yarn.nodemanager.local-dirs be set to 755 permissions.

a) TaskController
b) LinuxTaskController
c) LinuxController
d) None of the mentioned
✅ Correct Answer: b) LinuxTaskController
📝 Explanation:
LinuxTaskController keeps track of all paths and directories on datanode.

10. The configuration file must be owned by the user running _________

a) DataManager
b) NodeManager
c) ValidationManager
d) None of the mentioned
✅ Correct Answer: b) NodeManager
📝 Explanation:
To recap, local file-system permissions need to be modified.

11. What is data privacy?

a) Data privacy is the protection of personal data
b) Users who should not have access to it
c) The ability of individuals to determine who can access their personal information
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
Data privacy is the protection of personal data. Its core principle is that people have control over how their personal information is gathered, processed, and shared by the companies that have access to it.

12. What is personal data?

a) Information that relates to a specific person
b) It can't be accessed by unauthorised people
c) Both A and B
d) None of the above mentioned
✅ Correct Answer: c) Both A and B
📝 Explanation:
Personal data is information that relates to an identified or identifiable person. It can be accessed only by the people or organisations that have access rights to it under the General Data Protection Regulation (GDPR).

13. Which of the following is a component of data privacy?

a) Management of data risk
b) Data loss prevention
c) Password management
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
Data privacy includes management of data risk, data loss prevention and password management. Hence, all the mentioned points are the key components of data privacy.

14. Information that directly or indirectly links to a person is considered

a) PII
b) PIII
c) IIP
d) IPI
✅ Correct Answer: a) PII
📝 Explanation:
Information that directly or indirectly links to a person is considered Personally Identifiable Information (PII). PII is information that can be used alone or with other information to identify an individual.

15. Which of the following is/are examples of PII?

a) A full name
b) A Social Security number
c) A physical address
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
A full name, a Social Security number, and a physical address are all examples of PII.

16. Brajesh Shrivas lives at 60, City Center, Gwalior, India; this data record is an example of

a) PII
b) PIII
c) IIP
d) IPI
✅ Correct Answer: a) PII
📝 Explanation:
The record gives the name, address, city, and country of a specific person, which together describe the identity of an individual. This type of information is Personally Identifiable Information (PII).

17. What is pseudonymization?

a) A process of removing personal identifiers from data
b) Replacing identifiers with placeholder values
c) Protecting personal privacy or improving data security
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
Pseudonymization is the process of removing personal identifiers from data and replacing them with placeholder values, in order to protect personal privacy or improve data security.

18. When sensitive data falls into the hands of an unauthorised person, it is a

a) Data breach
b) Data access
c) Data control
d) None of the above mentioned
✅ Correct Answer: a) Data breach
📝 Explanation:
A data breach is a security incident in which sensitive or critical data is accessed by, or exposed to, an unauthorized party.

19. What is General Data Protection Regulation (GDPR)?

a) GDPR is a regulation on data protection and privacy in the European Union (EU)
b) It applies to organizations that collect or process people's personal data
c) It gives individuals rights over their personal data and simplifies the regulatory environment for international business
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
GDPR is a regulation on data protection and privacy in the European Union (EU). Organizations that collect or process people's personal data have to follow GDPR and its principles. It gives individuals rights over their personal data and simplifies the regulatory environment for international business.

20. Which of the following is/are main principles of GDPR?

a) Lawfulness, fairness, and transparency
b) Purpose limitation, Data minimization, Accuracy
c) Storage limitation, Integrity and confidentiality (security), Accountability
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
The seven main principles of GDPR are: Lawfulness, fairness, and transparency; Purpose limitation; Data minimization; Accuracy; Storage limitation; Integrity and confidentiality (security); Accountability.

21. Which of the following is/are action points that data controllers and processors need to take under GDPR?

a) Record keeping, Security measures
b) Data breach notification, Data Protection Officer (DPO)
c) Both A and B
d) None of the above mentioned
✅ Correct Answer: c) Both A and B
📝 Explanation:
Under GDPR, controllers and processors must take the following action points: Record keeping, Security measures, Data breach notification, and appointing a Data Protection Officer (DPO).

22. How many types of fines are imposed by GDPR on businesses that violate its policies?

a) 1
b) 2
c) 3
d) None of the above mentioned
✅ Correct Answer: b) 2
📝 Explanation:
GDPR imposes two types of fines, first tier and second tier, on businesses that violate its policies.

23. Which of the following is/are true with respect to the categories of GDPR violation?

a) First tier: A violation results in a maximum fine of either €10 million or 2% of the business's worldwide annual revenue, whichever is higher
b) Second tier: A violation results in a maximum fine of either €20 million or 4% of the business's worldwide annual revenue, whichever is higher
c) Both A and B
d) None of the above mentioned
✅ Correct Answer: c) Both A and B
📝 Explanation:
An organisation that violates GDPR faces a maximum fine of either €10 million or 2% of its worldwide annual revenue (first tier), or of either €20 million or 4% of its worldwide annual revenue (second tier), whichever is higher in each case.
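
The "whichever is higher" rule in the two tiers comes down to a single `max` call. The sketch below is only an illustration of that arithmetic; the function name and the sample revenue figures are made up for the example.

```python
# Fixed cap in euros and revenue percentage for each tier, as given in the options above.
TIER_CAPS = {1: (10_000_000, 2), 2: (20_000_000, 4)}


def max_gdpr_fine(annual_revenue_eur: int, tier: int) -> float:
    """Upper bound of the fine: the higher of the fixed cap and the revenue share."""
    fixed_cap, percent = TIER_CAPS[tier]
    return max(fixed_cap, annual_revenue_eur * percent / 100)


# 2% of 1 billion is 20 million, which exceeds the 10 million first-tier cap.
print(max_gdpr_fine(1_000_000_000, tier=1))
# 4% of 100 million is 4 million, so the 20 million second-tier cap applies.
print(max_gdpr_fine(100_000_000, tier=2))
```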

24. Which of the following is true about “Right to rectification” under GDPR?

a) Data subjects can correct inaccurate data about themselves
b) Data subjects cannot change
c) Data subjects have the right to obtain a copy of collected personal data
d) None of the above mentioned
✅ Correct Answer: a) Data subjects can correct inaccurate data about themselves
📝 Explanation:
The right to rectification is the data protection right that lets data subjects correct inaccurate data about themselves. An individual can challenge the accuracy of personal information held by an organisation and have it updated or removed.

25. Which of the following is true about “Right to Data Portability” under GDPR?

a) Data subjects must be given easy-to-understand information
b) Data subjects can transfer their data from one data controller to another
c) Data subjects can request that their data be deleted
d) None of the above mentioned
✅ Correct Answer: b) Data subjects can transfer their data from one data controller to another
📝 Explanation:
Right to Data Portability allows individuals to obtain and reuse their personal data and they can move, copy or transfer their data from one data controller to another in a safe and secure way, without affecting its usability.

26. Which of the following is true about “Right of access” under GDPR?

a) Data subjects can request that their data be deleted
b) Data subjects must be given easy-to-understand information
c) Data subjects have the right to obtain a copy of collected personal data
d) None of the above mentioned
✅ Correct Answer: c) Data subjects have the right to obtain a copy of collected personal data
📝 Explanation:
Right of access describes that data subjects have the right to obtain a copy of collected personal data.

27. In a storage unit such as a database, if a record is saved as “Person 17332”, this is

a) Pseudonymization
b) Personally Identifiable Information
c) General Data Protection Regulation
d) None of the above mentioned
✅ Correct Answer: a) Pseudonymization
📝 Explanation:
Pseudonymization is a data management process that replaces personally identifiable information with one or more artificial identifiers, known as pseudonyms. In the question above, "Person 17332" is a pseudonym that stands in for the original identity, so that unwanted users cannot understand or use it.
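
One common way to produce a pseudonym like the one in this question is a keyed hash of the identifier. The following is a minimal sketch under that assumption: the key, the `Person <hex>` label format, and the sample record are all illustrative, and a real system would keep the key in a secrets store.

```python
import hashlib
import hmac

# Hypothetical key for the example; never hard-code a real key in source.
SECRET_KEY = b"demo-key-for-illustration-only"


def pseudonymize(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a stable pseudonym derived from a keyed hash."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"Person {digest[:12]}"


record = {"name": "Brajesh Shrivas", "city": "Gwalior"}
safe_record = {**record, "name": pseudonymize(record["name"])}
print(safe_record)
```

Because the same input and key always give the same pseudonym, records can still be joined on the pseudonymized column, and whoever holds the key can recompute the mapping. That is what separates pseudonymization from irreversible anonymization.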

28. What is a basic difference between data privacy and data security?

a) Data privacy means personal information and data security refers to the process of protecting data
b) Data privacy refers to the process of protecting data and data security means personal information
c) Both A and B
d) None of the above mentioned
✅ Correct Answer: a) Data privacy means personal information and data security refers to the process of protecting data
📝 Explanation:
Data privacy concerns personal information: it defines policies for the management, processing, storage, sharing, and use of personal information. Data security refers to the process of protecting that data.

29. Information privacy, individual privacy and communication privacy are the three main pillars of

a) Digital Integrity
b) Digital protection
c) Digital secrecy
d) Digital privacy
✅ Correct Answer: d) Digital privacy
📝 Explanation:
Digital privacy includes Information privacy, Individual privacy and communication privacy.

30. Which of the following is a private search engine?

a) Yahoo
b) DuckDuckGo
c) Google
d) None of the above mentioned
✅ Correct Answer: b) DuckDuckGo
📝 Explanation:
DuckDuckGo is a private search engine. A private search engine lets users search online without logging their details or tracking their browsing sessions, and it does not sell or share their information without explicit permission.

31. ____ is the process of retaining data in a secure place for long-term storage.

a) Copies of data
b) Off-site backup
c) Data archiving
d) None of the above mentioned
✅ Correct Answer: c) Data archiving
📝 Explanation:
Data archiving is the process of retaining data in a secure place for long-term storage.

32. Which of the following is/are true about selective archiving?

a) Archive only a selective part of data because not all data is equally important
b) Data are constantly buffered but require explicit input to be archived
c) People can dynamically negotiate with their own policies around control
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
Selective archiving stores only the selected, important part of the data rather than all of it.

33. Which of the following is/are true of the secure disposal of data?

a) Keep careful records
b) Destroy the device
c) Destroy the data
d) All of the above mentioned
✅ Correct Answer: d) All of the above mentioned
📝 Explanation:
Secure data disposal is a process of permanently and securely removing sensitive or confidential data from storage devices to prevent unauthorized access or recovery. When data is deleted using standard methods, it is often still recoverable using specialized software or techniques. Secure data disposal ensures that the data is irreversibly destroyed, making it nearly impossible to retrieve.

34. Which of the following is/are true about eliminating potential clues?

a) It can provide crucial clues to a security cracker to break into our network and the systems that reside on it
b) We clear the configuration settings from networking equipment
c) Both A and B
d) All of the above mentioned
✅ Correct Answer: c) Both A and B
📝 Explanation:
Eliminating potential clues is a policy of removing or minimizing identifying information or sensitive data that could be used to identify individuals or compromise their privacy. Configuration settings left on networking equipment can give a security cracker crucial clues for breaking into our network and the systems that reside on it, so we clear those settings.

35. Which of the following is/are changes that have taken place in GDPR?

a) The individual must be informed of exactly what their data is being used for
b) Organisations must inform the person of their right to withdraw consent at any time
c) Both A and B
d) None of the above mentioned
✅ Correct Answer: c) Both A and B
📝 Explanation:
GDPR is a regulation on data privacy, and it is amended from time to time. One update requires that the individual be informed of exactly what their data is being used for and that organisations inform the person of their right to withdraw consent at any time.

36. What is the primary purpose of data masking in Big Data security?

a) To hide data from unauthorized users
b) To convert unstructured data into structured data
c) To increase data volume
d) To reduce data velocity
✅ Correct Answer: a) To hide data from unauthorized users
📝 Explanation:
Data masking replaces sensitive data with realistic but fictional data to protect it from unauthorized access while allowing testing and development.

37. Which technique is used for encryption at rest in HDFS?

a) Transparent Data Encryption
b) SSL/TLS
c) Kerberos
d) OAuth
✅ Correct Answer: a) Transparent Data Encryption
📝 Explanation:
HDFS Transparent Encryption allows encrypting data at rest without changing the application code.

38. What is the role of Apache Ranger in Big Data security?

a) Data ingestion
b) Centralized authorization
c) Stream processing
d) Graph analytics
✅ Correct Answer: b) Centralized authorization
📝 Explanation:
Apache Ranger provides centralized security administration and fine-grained authorization for Hadoop components.

39. Which protocol is commonly used for authentication in Hadoop clusters?

a) LDAP
b) Kerberos
c) SAML
d) JWT
✅ Correct Answer: b) Kerberos
📝 Explanation:
Kerberos is the standard for authentication in secure Hadoop deployments.

40. What does ABAC stand for in access control models?

a) Attribute-Based Access Control
b) Authentication-Based Access Control
c) Audit-Based Access Control
d) Application-Based Access Control
✅ Correct Answer: a) Attribute-Based Access Control
📝 Explanation:
ABAC uses attributes of users, resources, and environment to make access decisions.
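
To make the three attribute sources concrete, here is a small sketch of an ABAC decision. The attribute names and the policy itself (same department, sufficient clearance, business hours) are assumptions chosen for the example, not a standard policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user_department: str       # user attribute
    user_clearance: int        # user attribute
    resource_department: str   # resource attribute
    resource_sensitivity: int  # resource attribute
    hour_of_day: int           # environment attribute (0-23)


def is_allowed(req: AccessRequest) -> bool:
    """Hypothetical policy: same department, enough clearance, business hours."""
    same_department = req.user_department == req.resource_department
    cleared = req.user_clearance >= req.resource_sensitivity
    business_hours = 8 <= req.hour_of_day < 18
    return same_department and cleared and business_hours


print(is_allowed(AccessRequest("finance", 3, "finance", 2, hour_of_day=10)))
print(is_allowed(AccessRequest("finance", 3, "finance", 2, hour_of_day=23)))
```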

41. In Big Data, what is tokenization?

a) Replacing sensitive data with tokens
b) Encrypting data with keys
c) Hashing data for integrity
d) Compressing data for storage
✅ Correct Answer: a) Replacing sensitive data with tokens
📝 Explanation:
Tokenization substitutes sensitive data with non-sensitive tokens that can be mapped back to the original data.
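
A token has no mathematical relationship to the value it replaces; the mapping lives in a separate lookup store. The sketch below uses an in-memory dictionary as a stand-in for that store. The class name `TokenVault` and the `tok_` prefix are illustrative, and a production vault would be a hardened, access-controlled service.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a token vault (illustrative only)."""

    def __init__(self) -> None:
        self._value_by_token = {}
        self._token_by_value = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._token_by_value:
            return self._token_by_value[value]
        token = "tok_" + secrets.token_hex(8)
        while token in self._value_by_token:  # guard against an unlikely collision
            token = "tok_" + secrets.token_hex(8)
        self._value_by_token[token] = value
        self._token_by_value[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only callers with access to the vault can map a token back.
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("4111-1111-1111-1111")
print(token)
print(vault.detokenize(token))
```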

42. Which feature of Spark provides authentication for its web UI?

a) Shared secret
b) Kerberos
c) OAuth
d) All of the above
✅ Correct Answer: d) All of the above
📝 Explanation:
Spark supports multiple authentication mechanisms for securing its components.

43. What is the purpose of auditing in Big Data systems?

a) To monitor data access and changes
b) To increase data velocity
c) To reduce data variety
d) To store metadata
✅ Correct Answer: a) To monitor data access and changes
📝 Explanation:
Auditing logs user actions for compliance and security incident detection.

44. Which regulation focuses on data protection in the EU?

a) HIPAA
b) GDPR
c) SOX
d) PCI-DSS
✅ Correct Answer: b) GDPR
📝 Explanation:
GDPR is the General Data Protection Regulation for EU data privacy.

45. What is dynamic data masking?

a) Masking data at query time
b) Masking data during storage
c) Masking data during ingestion
d) Masking data after deletion
✅ Correct Answer: a) Masking data at query time
📝 Explanation:
Dynamic data masking obscures sensitive data in real-time based on user privileges.
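
The key point is that the stored value never changes; only what a given reader sees does. A minimal sketch, assuming hypothetical column names and a single privileged role:

```python
SENSITIVE_COLUMNS = {"ssn", "email"}        # hypothetical column names
PRIVILEGED_ROLES = {"compliance_officer"}   # hypothetical role name


def mask_value(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[hidden:]


def read_row(row: dict, role: str) -> dict:
    """Return the row as this role may see it; the stored row is left untouched."""
    if role in PRIVILEGED_ROLES:
        return dict(row)
    return {
        col: mask_value(val) if col in SENSITIVE_COLUMNS else val
        for col, val in row.items()
    }


stored = {"name": "A. User", "ssn": "123-45-6789", "email": "user@example.com"}
print(read_row(stored, "analyst"))
print(read_row(stored, "compliance_officer"))
```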

46. In Kafka, what secures data in transit?

a) SSL/TLS
b) AES encryption
c) Kerberos
d) All of the above
✅ Correct Answer: d) All of the above
📝 Explanation:
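Kafka uses SSL/TLS to encrypt traffic between clients and brokers, with AES available as the cipher inside the TLS session, and Kerberos (through SASL) to authenticate connections. Kafka itself does not encrypt data at rest.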

47. What is a common threat in Big Data environments?

a) Insider threats
b) SQL injection
c) DDoS attacks
d) All of the above
✅ Correct Answer: d) All of the above
📝 Explanation:
Big Data systems face various threats due to their scale and complexity.

48. Which tool is used for fine-grained access control in Hive?

a) Sentry
b) Ranger
c) Both a and b
d) None
✅ Correct Answer: c) Both a and b
📝 Explanation:
Both Apache Sentry and Ranger provide SQL-based authorization for Hive.

49. What is data anonymization?

a) Removing identifiers irreversibly
b) Replacing with pseudonyms
c) Encrypting data
d) Hashing data
✅ Correct Answer: a) Removing identifiers irreversibly
📝 Explanation:
Anonymization ensures data cannot be linked back to individuals.

50. In HDFS, permissions are based on what model?

a) RBAC
b) POSIX-like
c) ABAC
d) DAC
✅ Correct Answer: b) POSIX-like
📝 Explanation:
HDFS uses owner, group, and others with read/write/execute permissions.
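
The owner/group/others check can be sketched with plain bit operations on the octal mode. This is a simplified model of the evaluation order only: it ignores the HDFS superuser, ACLs, and the sticky bit, and the function names are hypothetical.

```python
def mode_to_string(mode: int) -> str:
    """Render a mode such as 0o750 as 'rwxr-x---'."""
    parts = []
    for shift in (6, 3, 0):  # owner, group, others
        bits = (mode >> shift) & 0b111
        parts.append("".join(
            flag if bits & (4 >> i) else "-" for i, flag in enumerate("rwx")
        ))
    return "".join(parts)


def is_permitted(mode: int, user: str, user_groups: set,
                 owner: str, group: str, want: str) -> bool:
    """Pick exactly one permission class (owner, then group, then others) and test it."""
    if user == owner:
        shift = 6
    elif group in user_groups:
        shift = 3
    else:
        shift = 0
    wanted_bit = {"r": 4, "w": 2, "x": 1}[want]
    return bool((mode >> shift) & wanted_bit)


print(mode_to_string(0o750))
print(is_permitted(0o750, "alice", {"analysts"}, "hdfs", "analysts", "r"))
print(is_permitted(0o750, "alice", {"analysts"}, "hdfs", "analysts", "w"))
```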

51. What secures RPC in Hadoop?

a) Kerberos
b) SSL
c) Both
d) None
✅ Correct Answer: c) Both
📝 Explanation:
RPC uses SASL with Kerberos, and SSL can be layered for encryption.

52. Which is a compliance standard for healthcare Big Data?

a) HIPAA
b) GDPR
c) CCPA
d) FERPA
✅ Correct Answer: a) HIPAA
📝 Explanation:
HIPAA protects health information in the US.

53. What is key management in Big Data encryption?

a) Managing encryption keys
b) Generating data keys
c) Rotating keys
d) All of the above
✅ Correct Answer: d) All of the above
📝 Explanation:
Key management ensures secure handling of cryptographic keys.

54. In Spark, how is executor security enforced?

a) YARN ACLs
b) Kerberos delegation
c) Both
d) None
✅ Correct Answer: c) Both
📝 Explanation:
Spark uses YARN for resource management and Kerberos for delegation tokens.

55. What is differential privacy?

a) Adding noise to protect individual data
b) Encrypting datasets
c) Masking columns
d) Tokenizing rows
✅ Correct Answer: a) Adding noise to protect individual data
📝 Explanation:
Differential privacy provides privacy guarantees in data analysis.
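
A common way to get such a guarantee for a count query is the Laplace mechanism: add noise with scale sensitivity/epsilon to the true answer. The sketch below assumes that mechanism; the sample ages and the epsilon value are made up for the example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Noisy count of the values matching `predicate`."""
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


ages = [23, 35, 41, 29, 52, 67, 38]
rng = random.Random()
print(private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

A smaller epsilon means more noise and stronger privacy; the answer stays useful in aggregate because the noise averages out to zero.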

56. Which Apache project provides auditing for Hadoop?

a) Falcon
b) Atlas
c) Ranger
d) All
✅ Correct Answer: d) All
📝 Explanation:
These projects support auditing features in the ecosystem.

57. What is the CIA triad in security?

a) Confidentiality, Integrity, Availability
b) Confidentiality, Integrity, Authentication
c) Confidentiality, Integrity, Authorization
d) Confidentiality, Identification, Availability
✅ Correct Answer: a) Confidentiality, Integrity, Availability
📝 Explanation:
CIA triad is the foundational model for information security.

58. In Big Data, what is a data lakehouse security challenge?

a) Schema enforcement
b) Access control on raw data
c) Both
d) None
✅ Correct Answer: c) Both
📝 Explanation:
Lakehouses combine data lakes and warehouses, requiring robust security.

59. What is OAuth used for in Big Data?

a) API authorization
b) User authentication
c) Data encryption
d) Key distribution
✅ Correct Answer: a) API authorization
📝 Explanation:
OAuth delegates access to resources without sharing credentials.

60. Which is a risk in multi-tenant Big Data clusters?

a) Data leakage between tenants
b) Resource contention
c) Both
d) None
✅ Correct Answer: c) Both
📝 Explanation:
Multi-tenancy requires isolation to prevent cross-tenant issues.

61. What is static data masking?

a) Masking in a copy of production data
b) Masking at runtime
c) Masking during transmission
d) Masking after analysis
✅ Correct Answer: a) Masking in a copy of production data
📝 Explanation:
Static masking creates de-identified datasets for non-production use.

62. In HBase, security is managed via what?

a) ACLs
b) RBAC
c) Kerberos
d) All
✅ Correct Answer: d) All
📝 Explanation:
HBase integrates with Hadoop security features.

63. What is the purpose of KMS in Hadoop?

a) Key Management Service
b) Data compression
c) Query optimization
d) Load balancing
✅ Correct Answer: a) Key Management Service
📝 Explanation:
Hadoop KMS manages encryption keys for HDFS.

64. Which standard is for payment card data in Big Data?

a) PCI-DSS
b) GDPR
c) HIPAA
d) SOX
✅ Correct Answer: a) PCI-DSS
📝 Explanation:
PCI-DSS ensures secure handling of cardholder data.

65. What is federated identity management?

a) Single sign-on across domains
b) Local user management
c) Password hashing
d) Biometric auth
✅ Correct Answer: a) Single sign-on across domains
📝 Explanation:
Federation allows trust between identity providers.

66. In Big Data, what is a zero-trust model?

a) Verify every access request
b) Trust internal network
c) No authentication
d) Static permissions
✅ Correct Answer: a) Verify every access request
📝 Explanation:
Zero-trust assumes no implicit trust, even inside the network.

67. What protects against SQL injection in Big Data queries?

a) Parameterized queries
b) Encryption
c) Masking
d) Auditing
✅ Correct Answer: a) Parameterized queries
📝 Explanation:
Parameterization prevents malicious code injection.
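
The difference is easiest to see with a real driver. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a Big Data SQL engine; the table and the input string are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "analyst")],
)


def find_user(connection: sqlite3.Connection, name: str) -> list:
    # The ? placeholder sends `name` as data, so it is never parsed as SQL.
    return connection.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()


malicious = "x' OR '1'='1"
print(find_user(conn, "alice"))    # [('alice', 'admin')]
print(find_user(conn, malicious))  # [] because the whole string is treated as a name
```

Had the query been built by string concatenation, the same input would have turned the WHERE clause into a condition that matches every row.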

68. Which is used for secure multi-party computation in Big Data?

a) Homomorphic encryption
b) Symmetric encryption
c) Hash functions
d) Digital signatures
✅ Correct Answer: a) Homomorphic encryption
📝 Explanation:
It allows computation on encrypted data without decryption.

69. What is the role of Apache Knox in Big Data?

a) Gateway for secure access
b) Data storage
c) Processing engine
d) Monitoring tool
✅ Correct Answer: a) Gateway for secure access
📝 Explanation:
Knox provides a REST API gateway with authentication.

70. In GDPR, what is data minimization?

a) Collect only necessary data
b) Store all data
c) Share widely
d) Ignore consent
✅ Correct Answer: a) Collect only necessary data
📝 Explanation:
Limit data collection to what is required for the stated purpose.

71. What is a use of blockchain in Big Data security?

a) Immutable audit logs
b) Data compression
c) Query execution
d) Resource allocation
✅ Correct Answer: a) Immutable audit logs
📝 Explanation:
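Blockchain makes records tamper-evident: each entry is cryptographically linked to the previous one, so altering an earlier record breaks the chain.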
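
The core idea behind such logs is a hash chain, which can be shown without a full blockchain (no consensus, no distribution). In this sketch each entry stores the hash of the previous one; the entry layout and function names are assumptions for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(event, prev_hash: str) -> str:
    payload = json.dumps({"event": event, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list, event) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev_hash": prev_hash,
                "hash": entry_hash(event, prev_hash)})


def verify(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, "alice read table sales")
append_entry(audit_log, "bob dropped table staging")
print(verify(audit_log))   # True
audit_log[0]["event"] = "nothing happened"
print(verify(audit_log))   # False
```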

72. Which cipher is recommended for Big Data encryption?

a) AES-256
b) DES
c) RC4
d) MD5
✅ Correct Answer: a) AES-256
📝 Explanation:
AES-256 is a strong symmetric encryption standard.

73. What is role-based access control (RBAC)?

a) Access based on user roles
b) Access based on attributes
c) Access based on time
d) Access based on location
✅ Correct Answer: a) Access based on user roles
📝 Explanation:
RBAC assigns permissions to roles, and users are assigned roles.
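
The two-step lookup (user to roles, roles to permissions) can be sketched in a few lines. The role, user, and permission names below are hypothetical.

```python
# Permissions attach to roles, never directly to users.
ROLE_PERMISSIONS = {
    "analyst": {"table:read"},
    "engineer": {"table:read", "table:write"},
}

# Users only receive roles.
USER_ROLES = {
    "alice": {"analyst"},
    "bob": {"engineer"},
}


def has_permission(user: str, permission: str) -> bool:
    """True if any of the user's roles carries the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(has_permission("alice", "table:write"))  # False
print(has_permission("bob", "table:write"))    # True
```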

74. In cloud Big Data, what is the shared responsibility model?

a) Provider secures infrastructure, user secures data
b) Provider secures all
c) User secures all
d) No security
✅ Correct Answer: a) Provider secures infrastructure, user secures data
📝 Explanation:
Shared model divides security duties between cloud provider and customer.

75. What is k-anonymity in privacy protection?

a) Each record indistinguishable from k-1 others
b) Encrypting k records
c) Deleting k attributes
d) Hashing k times
✅ Correct Answer: a) Each record indistinguishable from k-1 others
📝 Explanation:
K-anonymity generalizes data to prevent identification.
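
Checking the property is a matter of grouping rows by their quasi-identifiers and looking at the smallest group. A minimal sketch, with made-up rows whose ages and postcodes have already been generalized:

```python
from collections import Counter


def is_k_anonymous(rows: list, quasi_identifiers: list, k: int) -> bool:
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(size >= k for size in groups.values())


rows = [
    {"age_range": "30-39", "zip_prefix": "474", "diagnosis": "flu"},
    {"age_range": "30-39", "zip_prefix": "474", "diagnosis": "asthma"},
    {"age_range": "40-49", "zip_prefix": "474", "diagnosis": "flu"},
    {"age_range": "40-49", "zip_prefix": "474", "diagnosis": "diabetes"},
]

print(is_k_anonymous(rows, ["age_range", "zip_prefix"], k=2))  # True
print(is_k_anonymous(rows, ["age_range", "zip_prefix"], k=3))  # False
```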

76. Which tool enforces policies in Kafka?

a) Ranger
b) Sentry
c) Atlas
d) Falcon
✅ Correct Answer: a) Ranger
📝 Explanation:
Ranger integrates with Kafka for authorization.

77. What is a DDoS threat to Big Data systems?

a) Overloading ingestion pipelines
b) Encrypting data
c) Masking queries
d) Auditing logs
✅ Correct Answer: a) Overloading ingestion pipelines
📝 Explanation:
DDoS can disrupt availability of Big Data services.

78. What is certificate pinning in Big Data?

a) Binding to specific certificates for TLS
b) Storing certs in HDFS
c) Generating certs dynamically
d) Revoking certs
✅ Correct Answer: a) Binding to specific certificates for TLS
📝 Explanation:
Pinning prevents man-in-the-middle attacks.

79. In Big Data, what is secure multi-tenancy?

a) Isolating tenants' data and compute
b) Sharing all resources
c) No isolation
d) Single tenant only
✅ Correct Answer: a) Isolating tenants' data and compute
📝 Explanation:
Ensures tenants cannot access others' data.

80. What is the purpose of data lineage in security?

a) Tracking data flow for compliance
b) Speeding up queries
c) Compressing data
d) Partitioning tables
✅ Correct Answer: a) Tracking data flow for compliance
📝 Explanation:
Lineage helps audit and ensure data handling compliance.

81. Which is a vulnerability in unsecured Hadoop?

a) Unauthorized file access
b) Job submission by anyone
c) Both
d) None
✅ Correct Answer: c) Both
📝 Explanation:
Without security, Hadoop exposes data and resources.

82. What is SAML in Big Data auth?

a) Security Assertion Markup Language
b) Simple Access Method Layer
c) Secure Authentication Markup Language
d) System Access Markup Language
✅ Correct Answer: a) Security Assertion Markup Language
📝 Explanation:
SAML enables single sign-on for federated identity.

83. What is the main goal of Big Data security?

a) Protect data confidentiality, integrity, availability
b) Increase processing speed
c) Reduce storage costs
d) Improve analytics accuracy
✅ Correct Answer: a) Protect data confidentiality, integrity, availability
📝 Explanation:
Security ensures the core principles of information protection.

84. In Spark, what is a security implication of dynamic allocation?

a) Potential resource hijacking
b) No implication
c) Faster execution
d) Less memory use
✅ Correct Answer: a) Potential resource hijacking
📝 Explanation:
Dynamic allocation requires proper isolation to prevent abuse.

85. What is HIPAA compliance in Big Data?

a) Protecting health data
b) Financial reporting
c) EU privacy
d) Card security
✅ Correct Answer: a) Protecting health data
📝 Explanation:
HIPAA safeguards protected health information.

86. Which encryption type is used for data in transit?

a) TLS/SSL
b) AES
c) RSA
d) SHA
✅ Correct Answer: a) TLS/SSL
📝 Explanation:
TLS secures communications over networks.

87. What is Apache Sentry?

a) Role-based authorization for SQL
b) Data catalog
c) Workflow scheduler
d) Monitoring tool
✅ Correct Answer: a) Role-based authorization for SQL
📝 Explanation:
Sentry provides fine-grained access control for Hive and Impala.

88. What is a privacy-preserving technique using noise?

a) Local differential privacy
b) Tokenization
c) Hashing
d) Salting
✅ Correct Answer: a) Local differential privacy
📝 Explanation:
Adds noise to individual data before aggregation.
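
Randomized response is a classic instance of this idea: each person flips coins before answering, so no single report can be trusted, yet the population rate can still be estimated. The sketch below assumes the common variant where each person tells the truth with probability 1/2 and otherwise answers at random; the population is made up.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Noise is added on the individual's side, before anything is collected."""
    if rng.random() < 0.5:
        return truth             # answer honestly
    return rng.random() < 0.5    # answer with a fair coin flip instead


def estimate_true_rate(reports: list) -> float:
    observed = sum(reports) / len(reports)
    # Expected observed rate = 0.5 * true_rate + 0.25, solved here for true_rate.
    return 2 * (observed - 0.25)


rng = random.Random(42)
truths = [True] * 6000 + [False] * 14000   # hypothetical population, 30% "yes"
reports = [randomized_response(t, rng) for t in truths]
print(round(estimate_true_rate(reports), 3))
```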

89. In YARN, what enforces container security?

a) LinuxContainerExecutor
b) DefaultContainerExecutor
c) Both
d) None
✅ Correct Answer: a) LinuxContainerExecutor
📝 Explanation:
It runs containers as specific users for isolation.

90. What is CCPA?

a) California Consumer Privacy Act
b) Canadian Cyber Protection Act
c) Centralized Compliance Protocol Agreement
d) Cloud Computing Privacy Agreement
✅ Correct Answer: a) California Consumer Privacy Act
📝 Explanation:
CCPA gives California residents control over personal data.

91. What is secure boot in Big Data nodes?

a) Verifying firmware integrity
b) Encrypting boot data
c) Masking boot logs
d) Auditing boot process
✅ Correct Answer: a) Verifying firmware integrity
📝 Explanation:
Prevents rootkits by ensuring trusted boot chain.

92. Which is a Big Data insider threat mitigation?

a) Principle of least privilege
b) Full access grants
c) No auditing
d) Shared credentials
✅ Correct Answer: a) Principle of least privilege
📝 Explanation:
Limits user access to necessary resources only.

93. What is format-preserving encryption?

a) Encryption maintaining data format
b) Compressing encrypted data
c) Hashing formats
d) Tokenizing formats
✅ Correct Answer: a) Encryption maintaining data format
📝 Explanation:
Allows encryption without changing data structure.

94. In Big Data, what is SIEM?

a) Security Information and Event Management
b) System Integration Event Module
c) Secure Identity Encryption Method
d) Single Instance Event Manager
✅ Correct Answer: a) Security Information and Event Management
📝 Explanation:
SIEM collects and analyzes security logs.

95. What is the benefit of column-level encryption?

a) Selective data protection
b) Full table encryption
c) Row encryption only
d) No encryption
✅ Correct Answer: a) Selective data protection
📝 Explanation:
Encrypts specific columns for granular security.

96. Which protocol provides secure shell access in Big Data clusters?

a) SSH
b) HTTP
c) FTP
d) SMTP
✅ Correct Answer: a) SSH
📝 Explanation:
SSH provides secure remote access to nodes.

97. What is data sovereignty in Big Data?

a) Data stored in compliant jurisdictions
b) Data movement freedom
c) Data deletion rights
d) Data sharing policies
✅ Correct Answer: a) Data stored in compliant jurisdictions
📝 Explanation:
Ensures data residency meets legal requirements.

98. What is a canary token in security?

a) Decoy for detecting breaches
b) Encryption key
c) Audit log
d) Access token
✅ Correct Answer: a) Decoy for detecting breaches
📝 Explanation:
Triggers alerts when accessed by unauthorized users.

99. In Spark SQL, how is security typically enforced?

a) Hive authorization
b) Direct HDFS access
c) No security
d) Custom plugins only
✅ Correct Answer: a) Hive authorization
📝 Explanation:
Integrates with Hive for table-level security.

100. What is the purpose of threat modeling in Big Data?

a) Identify potential risks
b) Encrypt data
c) Mask queries
d) Audit access
✅ Correct Answer: a) Identify potential risks
📝 Explanation:
Systematically evaluates threats and mitigations.

101. Which is a quantum-resistant encryption approach for future Big Data systems?

a) Lattice-based cryptography
b) AES-128
c) RSA-1024
d) DES
✅ Correct Answer: a) Lattice-based cryptography
📝 Explanation:
Resists attacks from quantum computers.

102. What is RBAC in HDFS?

a) Not native, via external tools
b) Built-in
c) Deprecated
d) Optional only
✅ Correct Answer: a) Not native, via external tools
📝 Explanation:
HDFS natively uses POSIX-style permissions and ACLs; role-based access control is added through tools such as Apache Ranger or Apache Sentry.

103. What is the impact of unpatched Big Data software?

a) Exploitable vulnerabilities
b) No impact
c) Faster performance
d) Less storage
✅ Correct Answer: a) Exploitable vulnerabilities
📝 Explanation:
Patches fix known security issues.

104. What is secure deletion in Big Data?

a) Overwriting data multiple times
b) Simple delete
c) Archiving
d) Compressing
✅ Correct Answer: a) Overwriting data multiple times
📝 Explanation:
Ensures data cannot be recovered.

105. In Kafka, ACLs are for what?

a) Topic access control
b) Message encryption
c) Partition balancing
d) Consumer grouping
✅ Correct Answer: a) Topic access control
📝 Explanation:
Authorizes operations on topics.
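
The authorization idea can be sketched with a toy ACL check. This is not Kafka's API: the `Acl` class and `is_authorized` function are hypothetical names used only for illustration. The sketch assumes exact topic-name matching, lets a matching DENY entry override any ALLOW, and denies by default when nothing matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acl:
    """One hypothetical ACL entry (not Kafka's real classes)."""
    principal: str   # e.g. "User:alice"
    operation: str   # e.g. "READ" or "WRITE"
    topic: str       # exact topic name
    permission: str  # "ALLOW" or "DENY"


def is_authorized(acls, principal, operation, topic):
    """Return True only if an ALLOW entry matches and no DENY entry does."""
    matching = [
        a for a in acls
        if a.principal == principal
        and a.operation == operation
        and a.topic == topic
    ]
    if any(a.permission == "DENY" for a in matching):
        return False
    return any(a.permission == "ALLOW" for a in matching)


acls = [
    Acl("User:alice", "READ", "payments", "ALLOW"),
    Acl("User:bob", "READ", "payments", "ALLOW"),
    Acl("User:bob", "READ", "payments", "DENY"),
]
```

With these entries, alice may read `payments`, bob is blocked by the DENY entry, and any principal without an entry is refused.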

106. What is the principle of defense in depth?

a) Multiple layered security controls
b) Single strong control
c) No controls
d) Reactive only
✅ Correct Answer: a) Multiple layered security controls
📝 Explanation:
Provides redundancy if one layer fails.

107. Which is a Big Data security best practice?

a) Regular key rotation
b) Static keys forever
c) Shared keys
d) No backups
✅ Correct Answer: a) Regular key rotation
📝 Explanation:
Reduces risk if keys are compromised.

108. What is attribute-based encryption (ABE)?

a) Encryption based on attributes
b) Symmetric encryption
c) Hashing
d) Digital signing
✅ Correct Answer: a) Encryption based on attributes
📝 Explanation:
Allows decryption only if attributes match policy.

109. In Big Data, what is PII de-identification?

a) Removing or obscuring personal info
b) Collecting more PII
c) Sharing PII
d) Ignoring PII
✅ Correct Answer: a) Removing or obscuring personal info
📝 Explanation:
Reduces the risk of re-identifying individuals.
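
Two common de-identification steps, keyed pseudonymization and masking, can be sketched as follows. The key value and the function names `pseudonymize` and `mask_email` are assumptions made for this example; a real deployment would load the key from a secrets manager.

```python
import hashlib
import hmac

# Hypothetical demo key; never hardcode a real key like this.
SECRET_KEY = b"demo-only-key"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a stable keyed hash (HMAC-SHA-256, truncated)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


masked = mask_email("alice@example.com")  # "a***@example.com"
```

The keyed hash keeps records joinable (the same input always maps to the same pseudonym) without storing the original identifier.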

110. What is the role of IDS in Big Data?

a) Intrusion Detection System
b) Data Integration Service
c) Identity Detection System
d) Index Data Service
✅ Correct Answer: a) Intrusion Detection System
📝 Explanation:
Monitors for malicious activities.

111. Which compliance regulation applies to financial reporting data?

a) SOX
b) GDPR
c) HIPAA
d) FERPA
✅ Correct Answer: a) SOX
📝 Explanation:
Sarbanes-Oxley Act for financial reporting integrity.

112. What is secure coding in Big Data apps?

a) Avoiding vulnerabilities like injection
b) Hardcoding credentials
c) Ignoring errors
d) Using deprecated libs
✅ Correct Answer: a) Avoiding vulnerabilities like injection
📝 Explanation:
Follows secure development practices.

113. What is a honey pot in Big Data security?

a) Decoy system to attract attackers
b) Encryption tool
c) Masking technique
d) Audit software
✅ Correct Answer: a) Decoy system to attract attackers
📝 Explanation:
Distracts and studies attacker behavior.

114. In HDFS, what is an encryption zone?

a) Directory with transparent encryption
b) Unencrypted area
c) Compressed zone
d) Archived zone
✅ Correct Answer: a) Directory with transparent encryption
📝 Explanation:
All files in the zone are automatically encrypted.

115. What is the GDPR right to be forgotten?

a) Erase personal data on request
b) Keep data forever
c) Share data widely
d) Anonymize only
✅ Correct Answer: a) Erase personal data on request
📝 Explanation:
Allows individuals to request data deletion.

116. Which is used for secure data sharing in Big Data?

a) Federated learning
b) Centralized storage
c) No sharing
d) Plain text
✅ Correct Answer: a) Federated learning
📝 Explanation:
Trains models without sharing raw data.

117. What is vulnerability scanning in Big Data?

a) Identifying weaknesses in systems
b) Encrypting scans
c) Masking vulnerabilities
d) Deleting scans
✅ Correct Answer: a) Identifying weaknesses in systems
📝 Explanation:
Helps prioritize security fixes.

118. What is the benefit of immutable infrastructure?

a) Reduces configuration drift risks
b) Allows frequent changes
c) Increases complexity
d) No backups needed
✅ Correct Answer: a) Reduces configuration drift risks
📝 Explanation:
Servers are replaced rather than modified in place, so each one stays consistent with a known-good image.

119. In Big Data, what is context-aware access control?

a) Access based on user context like location
b) Static access
c) Role only
d) No control
✅ Correct Answer: a) Access based on user context like location
📝 Explanation:
Enhances security with dynamic factors.
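
A context-aware decision can be sketched as a role check combined with dynamic factors. The `AccessContext` fields, the allowed countries, and the working-hours window below are all illustrative assumptions, not values from any particular product.

```python
from dataclasses import dataclass

# Hypothetical policy values chosen only for the example.
ALLOWED_COUNTRIES = {"DE", "FR"}
WORK_START_HOUR = 8
WORK_END_HOUR = 18


@dataclass(frozen=True)
class AccessContext:
    role: str
    country: str  # two-letter country code of the request origin
    hour: int     # local hour of the request, 0-23


def allow_access(ctx: AccessContext) -> bool:
    """Grant access only if role, location, and time of day all pass."""
    if ctx.role != "analyst":
        return False
    if ctx.country not in ALLOWED_COUNTRIES:
        return False
    return WORK_START_HOUR <= ctx.hour < WORK_END_HOUR
```

The same analyst is allowed at 10:00 from an approved country but refused at 23:00 or from an unlisted location.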

120. What is a security information model for Big Data?

a) MITRE ATT&CK
b) CIA only
c) Volume model
d) Variety framework
✅ Correct Answer: a) MITRE ATT&CK
📝 Explanation:
Framework for adversary tactics and techniques.

121. Which is a post-quantum algorithm?

a) Kyber
b) RSA
c) ECC
d) Blowfish
✅ Correct Answer: a) Kyber
📝 Explanation:
Lattice-based key encapsulation for quantum resistance.

122. What is data classification in security?

a) Categorizing data by sensitivity
b) Encrypting all data
c) Deleting low sensitivity
d) Sharing classified only
✅ Correct Answer: a) Categorizing data by sensitivity
📝 Explanation:
Guides appropriate protection levels.

123. In Spark, what secures shuffle data?

a) Authentication tokens
b) No security
c) Full encryption
d) Compression only
✅ Correct Answer: a) Authentication tokens
📝 Explanation:
Spark authenticates shuffle connections with a shared secret (spark.authenticate); encryption of shuffle data must be enabled separately.

124. What is FERPA for Big Data?

a) Family Educational Rights and Privacy Act
b) Financial Encryption Regulation
c) Federal Employee Retirement Privacy Act
d) Foreign Exchange Reporting Privacy Act
✅ Correct Answer: a) Family Educational Rights and Privacy Act
📝 Explanation:
Protects student education records.

125. What is penetration testing in Big Data?

a) Simulated attacks to find vulnerabilities
b) Data penetration
c) Query testing
d) Storage testing
✅ Correct Answer: a) Simulated attacks to find vulnerabilities
📝 Explanation:
Validates security controls.

126. What is just-in-time access?

a) Temporary elevated privileges
b) Permanent access
c) No access
d) Audit only
✅ Correct Answer: a) Temporary elevated privileges
📝 Explanation:
Reduces attack surface by limiting time.

127. In HDFS, what is a block access token?

a) Auth for reading blocks
b) Encryption key
c) Compression token
d) Deletion token
✅ Correct Answer: a) Auth for reading blocks
📝 Explanation:
Validates client access to data blocks.
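
The idea behind a block access token can be illustrated with a small HMAC-signed token that an issuer creates and a storage node verifies with a shared secret. This is a sketch only, not Hadoop's actual token format: the field layout and the names `issue_token` and `verify_token` are hypothetical, and the sketch assumes user names and block IDs contain no `|` character.

```python
import hashlib
import hmac
import time


def issue_token(secret: bytes, user: str, block_id: str, expires_at: int) -> str:
    """Sign 'user|block_id|expires_at' with a secret shared by issuer and verifier."""
    body = f"{user}|{block_id}|{expires_at}"
    signature = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{body}|{signature}"


def verify_token(secret: bytes, token: str, block_id: str, now=None) -> bool:
    """Accept the token only if it is authentic, unexpired, and for this block."""
    now = int(time.time()) if now is None else now
    parts = token.split("|")
    if len(parts) != 4:
        return False
    user, token_block, expires_at, signature = parts
    body = f"{user}|{token_block}|{expires_at}"
    expected = hmac.new(secret, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False
    if not expires_at.isdigit():
        return False
    return token_block == block_id and now <= int(expires_at)
```

Because the verifier recomputes the signature itself, it can check a client's token without contacting the issuer on every read.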

128. What is the purpose of security baselines?

a) Standard configurations for security
b) Custom per node
c) No standards
d) Only for dev
✅ Correct Answer: a) Standard configurations for security
📝 Explanation:
Ensures consistent hardening.

129. Which is a Big Data security metric?

a) Mean time to detect (MTTD)
b) Data volume
c) Processing speed
d) Storage cost
✅ Correct Answer: a) Mean time to detect (MTTD)
📝 Explanation:
Measures how quickly security incidents are detected after they occur.
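
As a worked illustration, MTTD can be computed as the average gap between when each incident occurred and when it was detected. The function name `mean_time_to_detect` and the sample timestamps are made up for the example.

```python
from datetime import datetime, timedelta


def mean_time_to_detect(incidents):
    """Average (detected_at - occurred_at) over a list of datetime pairs."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (detected - occurred for occurred, detected in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Hypothetical incidents: detected after 2 hours and after 4 hours.
sample = [
    (datetime(2024, 1, 1, 8, 0), datetime(2024, 1, 1, 10, 0)),
    (datetime(2024, 1, 2, 8, 0), datetime(2024, 1, 2, 12, 0)),
]
mttd = mean_time_to_detect(sample)  # timedelta(hours=3)
```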

130. What is confidential computing?

a) Processing encrypted data in use
b) Storing encrypted data
c) Transmitting encrypted data
d) Deleting data
✅ Correct Answer: a) Processing encrypted data in use
📝 Explanation:
Protects data during computation, typically inside a hardware-based trusted execution environment.

131. In Ranger, what is tag-based policy?

a) Policies on data tags
b) Fixed policies
c) No policies
d) User only
✅ Correct Answer: a) Policies on data tags
📝 Explanation:
Dynamic classification for access control.

132. What is a supply chain attack in Big Data?

a) Compromising third-party tools
b) Direct node attack
c) Data poisoning
d) Query injection
✅ Correct Answer: a) Compromising third-party tools
📝 Explanation:
Targets dependencies like libraries.

133. What is the role of DLP in Big Data?

a) Data Loss Prevention
b) Data Load Processing
c) Dynamic Link Protocol
d) Distributed Ledger Processing
✅ Correct Answer: a) Data Loss Prevention
📝 Explanation:
Prevents unauthorized data exfiltration.

134. Which hashing algorithm is recommended for integrity checks in Big Data?

a) SHA-256
b) MD5
c) CRC32
d) All
✅ Correct Answer: a) SHA-256
📝 Explanation:
Secure hash for detecting tampering.
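
A minimal sketch of a SHA-256 integrity check using Python's standard `hashlib`: a digest is recorded when the data is written and compared when it is read back. The helper names `sha256_digest` and `is_untampered` are chosen for this example.

```python
import hashlib
import hmac


def sha256_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_untampered(data: bytes, expected_digest: str) -> bool:
    """Recompute the digest and compare it with the recorded one."""
    return hmac.compare_digest(sha256_digest(data), expected_digest)


record = b"order_id=17,amount=250"
recorded_digest = sha256_digest(record)  # stored alongside the data
```

Changing even one byte of `record` yields a different digest, so `is_untampered` returns False for the modified copy.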

135. What is micro-segmentation in Big Data security?

a) Network isolation at workload level
b) Full network access
c) No segmentation
d) VLAN only
✅ Correct Answer: a) Network isolation at workload level
📝 Explanation:
Limits lateral movement in clusters.

136. What is the GDPR data protection impact assessment?

a) Risk evaluation for high-risk processing
b) Annual audit
c) No assessment
d) Post-incident
✅ Correct Answer: a) Risk evaluation for high-risk processing
📝 Explanation:
Identifies and mitigates privacy risks.

137. In Big Data, what is secure code repository access?

a) Using Git with RBAC
b) Public repos
c) No version control
d) Shared folders
✅ Correct Answer: a) Using Git with RBAC
📝 Explanation:
Controls who can view and modify code.

138. What is behavioral analytics in security?

a) Detecting anomalies in user behavior
b) Static rule matching
c) No monitoring
d) Log deletion
✅ Correct Answer: a) Detecting anomalies in user behavior
📝 Explanation:
Uses ML to identify insider threats.
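
Production systems often use trained models, but the core idea of flagging departures from a user's baseline can be sketched with a plain z-score, which is a simple statistical stand-in rather than ML. The function name `is_anomalous` and the threshold of 3 standard deviations are assumptions for the example.

```python
from statistics import mean, pstdev


def is_anomalous(history, today, threshold=3.0):
    """Flag today's count if it lies more than `threshold` std devs from the mean."""
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return today != baseline
    return abs(today - baseline) / spread > threshold


# Hypothetical daily file-download counts for one user.
downloads = [100, 102, 98, 101, 99]
```

Against this baseline, a day with 500 downloads is flagged while a day with 101 is not.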

139. Which is a key rotation best practice?

a) Automate and monitor
b) Manual only
c) Never rotate
d) Rotate daily
✅ Correct Answer: a) Automate and monitor
📝 Explanation:
Ensures timely updates without disruption.

140. What is the purpose of security orchestration?

a) Automating incident response
b) Manual triage
c) No response
d) Data collection only
✅ Correct Answer: a) Automating incident response
📝 Explanation:
Coordinates tools for faster remediation.

141. In Big Data, what is the benefit of homomorphic encryption?

a) Compute on ciphertext
b) Faster decryption
c) Smaller keys
d) No computation
✅ Correct Answer: a) Compute on ciphertext
📝 Explanation:
Enables analytics without exposing plaintext.
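
Computing on ciphertext can be demonstrated with a toy version of the Paillier scheme, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The tiny primes below make this insecure and for demonstration only, and the function names are chosen for this sketch. It assumes Python 3.8+ for `pow(x, -1, n)`.

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (demo sizes only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu is the modular inverse of L(g^lam mod n^2), where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    """Encrypt m (0 <= m < n) with fresh randomness r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c: int) -> int:
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    l_value = (pow(c, lam, n * n) - 1) // n
    return (l_value * mu) % n


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Multiply ciphertexts to add the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen(61, 53)
encrypted_sum = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))
total = decrypt(pub, priv, encrypted_sum)  # 42, computed without decrypting the inputs
```

The party performing `add_encrypted` never sees 20 or 22; only the key holder can decrypt the result. Sums must stay below `n` to decrypt correctly.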

142. What is a privacy by design principle?

a) Embed privacy in system design
b) Add later
c) Ignore privacy
d) Compliance only
✅ Correct Answer: a) Embed privacy in system design
📝 Explanation:
Proactive approach to data protection.

143. Which tool is used for Big Data lineage and security governance?

a) Apache Atlas
b) Oozie
c) Hue
d) Zeppelin
✅ Correct Answer: a) Apache Atlas
📝 Explanation:
Tracks metadata and governance.

144. What is the risk of shadow IT in Big Data?

a) Unauthorized data stores
b) No risk
c) Faster innovation
d) Less cost
✅ Correct Answer: a) Unauthorized data stores
📝 Explanation:
Bypasses security controls.

145. What is certificate management in TLS?

a) Issuing, renewing, revoking certs
b) Generating keys only
c) No management
d) Static certs
✅ Correct Answer: a) Issuing, renewing, revoking certs
📝 Explanation:
Maintains trust in encrypted connections.

146. In Big Data, what is secure analytics?

a) Analyzing without compromising privacy
b) Plain text analysis
c) No analysis
d) Local only
✅ Correct Answer: a) Analyzing without compromising privacy
📝 Explanation:
Uses techniques like federated learning.

147. What is the PCI-DSS requirement for Big Data?

a) Protect cardholder data
b) Health data only
c) Educational data
d) No requirements
✅ Correct Answer: a) Protect cardholder data
📝 Explanation:
PCI DSS defines 12 core requirements for protecting payment card data.

148. What is a blue team in Big Data security?

a) Defensive security operations
b) Offensive testing
c) No team
d) Development team
✅ Correct Answer: a) Defensive security operations
📝 Explanation:
Focuses on detection and response.

149. What is data redaction?

a) Permanently removing sensitive info
b) Temporary masking
c) Encrypting
d) Hashing
✅ Correct Answer: a) Permanently removing sensitive info
📝 Explanation:
Irreversibly hides data in outputs.
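
A minimal redaction sketch using regular expressions: matched values are replaced with fixed placeholders, so the original text cannot be recovered from the output. The two patterns (US-style SSNs and simple email addresses) and the placeholder strings are illustrative assumptions and would miss many real-world formats.

```python
import re

# Illustrative patterns only; real redaction needs broader detection.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def redact(text: str) -> str:
    """Replace SSN-like and email-like substrings with fixed placeholders."""
    text = SSN_PATTERN.sub("[REDACTED-SSN]", text)
    return EMAIL_PATTERN.sub("[REDACTED-EMAIL]", text)


clean = redact("SSN 123-45-6789, mail bob@example.com.")
# clean == "SSN [REDACTED-SSN], mail [REDACTED-EMAIL]."
```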

150. In YARN, what is a delegation token?

a) Long-lived auth for jobs
b) Short-lived only
c) No tokens
d) User passwords
✅ Correct Answer: a) Long-lived auth for jobs
📝 Explanation:
Lets running jobs authenticate to services without repeatedly contacting Kerberos; the token can be renewed up to a maximum lifetime.

151. What is the benefit of immutable logs?

a) Tamper-evident auditing
b) Editable logs
c) No logs
d) Compressed logs
✅ Correct Answer: a) Tamper-evident auditing
📝 Explanation:
Prevents alteration of security records.
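
One way to make a log tamper-evident is a hash chain, where every entry's hash covers the previous entry's hash. This is a sketch under that assumption, not how any specific product stores logs; `append_entry` and `verify_chain` are hypothetical names.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(message: str, prev_hash: str) -> str:
    """Hash the message together with the previous entry's hash."""
    payload = json.dumps({"message": message, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(chain: list, message: str) -> dict:
    """Append a new entry linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "message": message,
        "prev": prev_hash,
        "hash": _entry_hash(message, prev_hash),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["message"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing an earlier message changes its hash, which no longer matches what the later entries recorded, so `verify_chain` returns False.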

152. Which is a Big Data encryption key hierarchy?

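a) Master key > EZ keys > DEKs
b) All same level
c) No hierarchy
d) User keys only
✅ Correct Answer: a) Master key > EZ keys > DEKs
📝 Explanation:
Each level protects the one below it: a master key held by the key management service protects the encryption zone (EZ) keys, and each EZ key encrypts the per-file data encryption keys (DEKs).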

153. What is threat hunting in Big Data?

a) Proactive search for threats
b) Reactive only
c) No hunting
d) Data hunting
✅ Correct Answer: a) Proactive search for threats
📝 Explanation:
Uses analytics to find hidden adversaries.

154. What is the role of SOAR in security?

a) Security Orchestration, Automation, Response
b) Simple Operations and Reporting
c) Secure Object Access Rights
d) System Optimization and Recovery
✅ Correct Answer: a) Security Orchestration, Automation, Response
📝 Explanation:
Automates workflows for efficiency.

155. In Big Data, what is synthetic data for security?

a) Generated data mimicking real without PII
b) Real data copy
c) Encrypted real
d) No data
✅ Correct Answer: a) Generated data mimicking real without PII
📝 Explanation:
Used for testing without privacy risks.

156. What is a security rating for Big Data vendors?

a) Assesses third-party risk
b) No rating
c) Performance only
d) Cost rating
✅ Correct Answer: a) Assesses third-party risk
📝 Explanation:
Helps evaluate supplier security posture.

157. What is the principle of least functionality?

a) Run only necessary services
b) Run all services
c) No principles
d) Max functionality
✅ Correct Answer: a) Run only necessary services
📝 Explanation:
Minimizes attack surface.

158. Which is a Big Data security framework?

a) NIST Cybersecurity Framework
b) Hadoop only
c) Spark framework
d) No framework
✅ Correct Answer: a) NIST Cybersecurity Framework
📝 Explanation:
Guides risk management.

159. What is endpoint detection and response (EDR)?

a) Monitors nodes for threats
b) Network only
c) Cloud only
d) No detection
✅ Correct Answer: a) Monitors nodes for threats
📝 Explanation:
Provides visibility into endpoint activities.

160. In Big Data, what is secure supply chain?

a) Verifying component integrity
b) Open source only
c) No verification
d) Trusted blindly
✅ Correct Answer: a) Verifying component integrity
📝 Explanation:
Prevents tampered dependencies.

161. What is the GDPR accountability principle?

a) Demonstrate compliance
b) Ignore records
c) No proof
d) Self-audit only
✅ Correct Answer: a) Demonstrate compliance
📝 Explanation:
Organizations must show how they meet requirements.

162. What is chaos engineering for security?

a) Testing resilience to failures
b) Causing failures
c) No testing
d) Static testing
✅ Correct Answer: a) Testing resilience to failures
📝 Explanation:
Includes security failure scenarios.

163. Which is a post-incident security measure?

a) Root cause analysis
b) Ignore incident
c) Delete logs
d) No measure
✅ Correct Answer: a) Root cause analysis
📝 Explanation:
Learns from breaches to improve.

164. What is secure multi-cloud strategy?

a) Consistent security across providers
b) Provider-specific only
c) Single cloud
d) No strategy
✅ Correct Answer: a) Consistent security across providers
📝 Explanation:
Applies uniform policies so that protection does not depend on one provider's defaults.

165. In Big Data, what is privacy-enhancing technology (PET)?

a) Tools like homomorphic encryption
b) No tech
c) Basic encryption
d) Masking only
✅ Correct Answer: a) Tools like homomorphic encryption
📝 Explanation:
Enables data use while preserving privacy.

166. What is the final layer in defense in depth?

a) Data encryption
b) Perimeter firewall
c) Access control
d) All layers needed
✅ Correct Answer: d) All layers needed
📝 Explanation:
Multiple layers provide comprehensive protection.