This repository was archived by the owner on Jun 23, 2022. It is now read-only.

What is Security

Computer security is a critical part of conducting business in today’s information processing economy. Companies need to take a proactive view of the setup, configuration and administration of their information systems. At its core, security should protect the privacy and the integrity of the data.

Over the years, technology has evolved and systems have become more complex. The advance of cloud computing and service-oriented architecture makes it hard for IT architects and professionals to design secure systems and to understand the risks that could compromise them, along with their potential impact.

When designing secure systems, one has to think in terms of:

  1. Confidentiality - only the right person can access the service or information
  2. Integrity - need to guarantee the service or information has not been altered in any way
  3. Availability - need to guarantee the service is available when requested
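
As a small illustration of the integrity property, the sketch below tags a piece of data with a keyed hash (HMAC-SHA256) and later checks that the data has not been altered. It uses only Python's standard library. The function names `sign` and `verify` and the sample key are assumptions made for this example; they are not part of Hadoop or of this guide.

```python
import hashlib
import hmac


def sign(data: bytes, key: bytes) -> str:
    """Return a hex-encoded HMAC-SHA256 tag for the given data."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(data: bytes, key: bytes, tag: str) -> bool:
    """Return True only if the tag matches the data under the given key."""
    # compare_digest performs a constant-time comparison of the two tags.
    return hmac.compare_digest(sign(data, key), tag)


if __name__ == "__main__":
    key = b"example-secret-key"  # hypothetical key, for illustration only
    record = b"account=42;balance=100"
    tag = sign(record, key)

    print(verify(record, key, tag))                     # unmodified record
    print(verify(b"account=42;balance=900", key, tag))  # altered record
```

A keyed hash is used here rather than a plain digest so that someone who alters the data cannot simply recompute a matching tag without knowing the key.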

Security Considerations

Security is not a simple feature but a set of layers in the overall system. The goal should be to manage and mitigate the risk of an attack. These layers can be thought of as:

  1. Physical - control who, where and how someone can access a given resource, site, etc.
  2. Technical - technologies such as encryption, authentication and auditing that restrict or control access
  3. Administrative - how processes and accounts are managed

Security for Hadoop

Hadoop offers a rich and powerful set of functionality to manage and process vast amounts of data. The distributed nature of such a system, the network topology and the number of different services offered present a challenge when it comes to designing and controlling it in a secure manner. System and security architects and administrators will need to consider:

  1. Authentication and Authorization - Are the right users accessing services or data, and what operations can they perform?
  2. Data protection - such as encryption for data at rest or in motion
  3. Network and Access Control - What services can and should be exposed. Use of a reverse proxy to minimize risk and control access (single point).
  4. Auditing - Record and verify who, what and when.
  5. Administration - Setting up, installing and managing a secure environment.
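
To make the relationship between authorization and auditing concrete, here is a minimal sketch in Python. It is not Hadoop's API: the `AccessController` class, its policy layout and the sample users and paths are all hypothetical, chosen only to show that every access decision, allowed or denied, can be recorded with who, what and when.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessController:
    # Hypothetical policy: user name -> set of (operation, resource) pairs.
    policy: dict = field(default_factory=dict)
    # In-memory audit trail; a real system would write to durable storage.
    audit_log: list = field(default_factory=list)

    def authorize(self, user: str, operation: str, resource: str) -> bool:
        """Decide whether the user may perform the operation, and audit it."""
        allowed = (operation, resource) in self.policy.get(user, set())
        self.audit_log.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "who": user,
            "what": f"{operation} {resource}",
            "allowed": allowed,
        })
        return allowed


if __name__ == "__main__":
    controller = AccessController(policy={"alice": {("read", "/data/sales")}})
    controller.authorize("alice", "read", "/data/sales")  # permitted by policy
    controller.authorize("bob", "read", "/data/sales")    # no policy entry
    for entry in controller.audit_log:
        print(entry)
```

Denying by default (a user with no policy entry gets no access) and logging denied attempts as well as granted ones are design choices of this sketch, not requirements stated in this guide.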

Conclusion

In this security guide we will present how Hadoop architects, administrators and security specialists need to take an overall approach to securing Hadoop clusters. We will explore best practices for the security considerations you might need to address and how you can implement such controls in an efficient manner.