Apache Interview Questions you must look at

This topic contains 0 replies, has 1 voice, and was last updated by Bilal Bilal 1 year ago.


    Apache Ambari is an open-source platform that supports and complements Hadoop by provisioning and managing Hadoop clusters. As per a market survey, Ambari holds a market share of about 49.30%. Hence, from a Hadoop administration perspective, it is essential to learn Apache Ambari. If you are preparing for your next job interview as a Hadoop administrator, be ready to face Apache Ambari interview questions.

    The complexity of interview questions on Apache Ambari depends on the roles and responsibilities of the position you have applied for. In this blog, we discuss some of the best Apache Ambari interview questions, grouped by complexity level, that we believe will help you in your preparation.

    Some Common Apache Ambari Interview Questions and Answers

    The following questions and answers cover the basic concepts of Ambari and apply to all roles.

    1. Explain Apache Ambari with its key features.

    Answer: Apache Ambari is an Apache product designed to make Hadoop projects easier to manage. Ambari helps with:

    • Easy provisioning
    • Convenient project management
    • Hadoop cluster monitoring
    • An intuitive interface
    • Support for RESTful APIs
    • A web UI for Hadoop management

    2. Why should you use Apache Ambari as a Hadoop user or system administrator?

    Answer: There are multiple benefits that a Hadoop user can achieve by using Apache Ambari.

    Using Ambari, a system administrator can –

    • Install Hadoop across any number of hosts using Ambari's step-by-step wizard, while Ambari handles the configuration for the Hadoop installation.
    • Centrally manage the Hadoop services across the cluster.
    • Efficiently monitor the status and health of the Hadoop cluster using the Ambari Metrics System. Additionally, the Ambari Alert Framework provides timely notifications of system issues such as low disk space or a change in a node's running status.
    • Integrate the functionality mentioned above into an application using the Ambari RESTful APIs.
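
As a rough illustration of the last point, the sketch below builds (but does not send) a request against the Ambari REST API using only the Python standard library. The host name, port 8080, and the admin/admin credentials are placeholder assumptions for this example; the `/api/v1/clusters` path and the `X-Requested-By` header follow the usual conventions of the Ambari REST API.

```python
import base64
import urllib.request


def build_clusters_request(server, user, password, port=8080):
    """Build a GET request for the list of clusters known to an Ambari server.

    All argument values are placeholders supplied by the caller; nothing is
    sent until the request is passed to urllib.request.urlopen().
    """
    url = f"http://{server}:{port}/api/v1/clusters"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    # Ambari API calls are authenticated with HTTP Basic authentication.
    request.add_header("Authorization", f"Basic {token}")
    # Ambari expects this header on modifying calls (POST/PUT/DELETE);
    # sending it on a GET is harmless.
    request.add_header("X-Requested-By", "ambari")
    return request


if __name__ == "__main__":
    # "ambari.example.com" and admin/admin are hypothetical values.
    req = build_clusters_request("ambari.example.com", "admin", "admin")
    print(req.get_method(), req.full_url)
```

To actually run the call, pass the returned object to `urllib.request.urlopen()` on a network that can reach your Ambari server and parse the JSON response.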

    3. What are the operating systems supported by Apache Ambari?

    Answer: Apache Ambari supports the 64-bit versions of the following operating systems:

    • CentOS 6 and 7
    • RHEL (Red Hat Enterprise Linux) 6 and 7
    • SLES (SUSE Linux Enterprise Server) 11
    • Ubuntu 12 and 14
    • OEL (Oracle Enterprise Linux) 6 and 7
    • Debian 7

    4. Can you explain the Apache Ambari architecture?

    Answer: Apache Ambari consists of the following major components:

    • Ambari Server
    • Ambari Agent
    • Ambari Web

    The Ambari Server handles all the metadata and includes an instance of a PostgreSQL database, as shown in the figure. Each host in the cluster runs one copy of the Ambari Agent, through which the Ambari Server controls that host.

    The Ambari Agent is an active process on each host. It sends heartbeats from the node to the Ambari Server, along with multiple operational metrics that are used to determine the health status of the node.

    Ambari Web is a client-side JavaScript application that periodically calls the Ambari RESTful API to perform cluster operations. Communication between the application and the server is asynchronous and goes through the RESTful API.
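
The heartbeat mechanism described above can be pictured with a small, self-contained model. This is only an illustrative sketch, not Ambari's actual implementation: the class name `HeartbeatMonitor`, the 60-second timeout, and the status labels are all assumptions made for the example.

```python
import time


class HeartbeatMonitor:
    """Toy model of a server that tracks agent heartbeats per host."""

    def __init__(self, timeout_seconds=60.0):
        # Hypothetical threshold: a host is treated as lost after this long.
        self.timeout_seconds = timeout_seconds
        self._last_seen = {}

    def record_heartbeat(self, host, now=None):
        """Store the time at which a host's agent last reported in."""
        self._last_seen[host] = time.time() if now is None else now

    def status(self, host, now=None):
        """Return 'UNKNOWN', 'HEALTHY' or 'HEARTBEAT_LOST' for a host."""
        now = time.time() if now is None else now
        last = self._last_seen.get(host)
        if last is None:
            return "UNKNOWN"
        if now - last > self.timeout_seconds:
            return "HEARTBEAT_LOST"
        return "HEALTHY"


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=60.0)
    monitor.record_heartbeat("node1.example.com", now=1000.0)
    print(monitor.status("node1.example.com", now=1030.0))
    print(monitor.status("node1.example.com", now=1100.0))
```

The explicit `now` argument exists only to make the behaviour easy to reason about; a real server would rely on its own clock and would also take the reported metrics into account.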

    5. How many layers of Hadoop components are supported by Apache Ambari and what are they?

    Answer: Apache Ambari supports three layers of Hadoop components:

    1. Hadoop core components
    • Hadoop Distributed File System (HDFS)
    • MapReduce
    2. Essential Hadoop components
    • Apache Pig
    • Apache Hive
    • Apache HCatalog
    • WebHCat
    • Apache HBase
    • Apache ZooKeeper
    3. Hadoop support components
    • Apache Oozie
    • Apache Sqoop
    • Ganglia
    • Nagios

    6. What is a repository in Apache Ambari?

    Answer: A repository is a hosted space from which Ambari software packages are downloaded and installed. Repositories come in different, OS-specific versions. Depending on your internet access, you can use either of two repository formats:

    • Tarball (.tar format, if you don’t have internet access)
    • Repo file (.repo format, if you have temporary internet access)

    7. What are the different types of Ambari repositories?

    Answer: There are four main types of Ambari repositories, as listed below –

    1. Ambari: This repository is used for the Ambari server, the monitoring software packages, and the Ambari agent.
    2. HDP-UTILS: This repository is used for Ambari and HDP utility packages.
    3. HDP: This repository hosts the Hadoop stack packages.
    4. Extra Packages for Enterprise Linux (EPEL): This repository holds an additional set of packages for Enterprise Linux.

    8. What is a local repository and when will you use it?

    Answer: A local repository is a hosted space in the local environment for Ambari software packages. It is mainly used when enterprise clusters have no or limited outbound internet access.
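
As a sketch of what pointing a host at a local repository might look like, the snippet below renders a yum-style `.repo` file with Python's standard `configparser` module. The repository id, the name, and the `http://mirror.example.com/...` base URL are hypothetical values chosen for the example; substitute the location of your own mirror.

```python
import configparser
import io


def render_repo_file(repo_id, name, baseurl, gpgcheck=False):
    """Return the text of a yum-style .repo file with a single section."""
    parser = configparser.ConfigParser()
    parser[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "gpgcheck": "1" if gpgcheck else "0",
        "enabled": "1",
    }
    buffer = io.StringIO()
    # Repo files conventionally use "key=value" without spaces.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    # The URL below is a placeholder for a local mirror, not a real server.
    print(render_repo_file(
        repo_id="ambari-local",
        name="Ambari (local mirror)",
        baseurl="http://mirror.example.com/ambari/centos7",
    ))
```

On RHEL or CentOS hosts, a file like this would typically be saved under `/etc/yum.repos.d/`.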

    9. What are the benefits of setting up a local repository?

    Answer: First and foremost, a local repository lets you access Ambari software packages without internet access. Along with that, you gain benefits like –

    • Enhanced governance with better installation performance
    • Routine post-installation cluster operations, such as starting and restarting services

    10. Explain the different life cycle commands in Ambari.

    Answer: Apache Ambari has a defined set of life cycle commands to add, remove, or reconfigure any of the services. These are –

    • Start
    • Stop
    • Install
    • Configure
    • Status
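
Through the REST API, starting and stopping a service is expressed as a change of the service's desired state. The sketch below only builds the request and omits authentication for brevity; the server address, the cluster name `mycluster`, and the service name are placeholders, and the `STARTED`/`INSTALLED` state names reflect the convention used in Ambari REST calls, where a stopped service is reported as `INSTALLED`.

```python
import json
import urllib.request

# Mapping from a life cycle command to the desired service state.
# In Ambari REST calls a stopped service is in the INSTALLED state.
DESIRED_STATE = {"start": "STARTED", "stop": "INSTALLED"}


def build_service_request(base_url, cluster, service, command):
    """Build a PUT request asking Ambari to start or stop a service."""
    state = DESIRED_STATE[command]  # raises KeyError for unsupported commands
    payload = {
        "RequestInfo": {"context": f"{command.capitalize()} {service} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    url = f"{base_url}/api/v1/clusters/{cluster}/services/{service}"
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), method="PUT"
    )
    # Ambari expects this header on modifying calls.
    request.add_header("X-Requested-By", "ambari")
    return request


if __name__ == "__main__":
    # Hypothetical server, cluster, and service names.
    req = build_service_request(
        "http://ambari.example.com:8080", "mycluster", "HDFS", "stop"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```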

    11. What are the tools you need to build Ambari?

    Answer: The following tools are required to build Ambari –

    • JDK 7
    • Apache Maven 3.3.9 or later
    • Python 2.6 or later
    • Node.js
    • g++
    • Xcode (on macOS)

    12. What are the different tools used for monitoring in Ambari?

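    Answer: Older Ambari releases (before 2.0) rely on two open-source monitoring tools, which later releases replace with the Ambari Metrics System and the Ambari Alert Framework –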

    • Ganglia
    • Nagios

    13. What are the particular functionalities of Ganglia in Ambari?

    Answer: The functionalities of Ganglia in Ambari are –

    • Monitoring the cluster
    • Identifying trending patterns
    • Collecting metrics across the cluster
    • Supporting detailed heatmaps

    14. What are the particular functionalities of Nagios in Ambari?

    Answer: The functionalities of Nagios in Ambari are –

    • Checking the health of the nodes and sending alerts
    • Sending alert emails for any notification type or service type

    15. Explain some of the basic commands used for the Apache Ambari server.

    Answer: The following commands are used for the Apache Ambari server –

    • To start the Ambari Server:

    ambari-server start

    • To check the Ambari Server processes:

    ps -ef | grep Ambari

    • To stop the Ambari Server:

    ambari-server stop
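
For scripting, the same commands can be wrapped in a small helper. The sketch below is an example under stated assumptions rather than an official tool: it presumes `ambari-server` is on the `PATH` of the machine where it runs, and it adds `status`, another `ambari-server` subcommand, alongside `start` and `stop`. The function names are made up for this illustration.

```python
import subprocess

# Subcommands this helper is willing to run.
ALLOWED_ACTIONS = ("start", "stop", "status")


def ambari_server_command(action):
    """Return the argument list for an ambari-server subcommand."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["ambari-server", action]


def run_ambari_server(action):
    """Run the command and return its exit code (needs ambari-server installed)."""
    completed = subprocess.run(ambari_server_command(action), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Only prints the commands; call run_ambari_server() on a real Ambari host.
    for action in ALLOWED_ACTIONS:
        print(" ".join(ambari_server_command(action)))
```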

    Advanced Apache Ambari Interview Questions

    1. What is the latest release of Apache Ambari?

    Answer: At the time of writing, the latest release of Ambari is 2.6.2.

    2. What are the new additions in the Ambari 2.6 versions?

    Answer: Ambari 2.6.2 added the following features:

    • Protection of Zeppelin Notebook SSL credentials
    • The ability to set appropriate HTTP headers to use cloud object stores with HDP

    Ambari 2.6.1 added the following feature:

    • Conditional installation of LZO packages through Ambari

    Ambari 2.6.0 added the following features:

    • Distributed mode for the Ambari Metrics System (AMS), with multiple Collectors
    • Host recovery improvements for restarts
    • Moving masters with minimum impact, and scale testing
    • Improvements to data archival and purging in Ambari Infra

    3. Which tasks can you perform to manage hosts using the Ambari Hosts tab?

    Answer: Using the Hosts tab, we can perform the following tasks:

    • Analysing host status
    • Searching the Hosts page
    • Performing host-related actions
    • Managing host components
    • Decommissioning a master node or slave node
    • Deleting a component
    • Setting up Maintenance Mode
    • Adding hosts to or removing hosts from a cluster
    • Establishing rack awareness

    4. Which tasks can you perform to manage services using the Ambari Services tab?

    Answer: Using the Services tab, we can perform the following tasks:

    • Starting and stopping all services
    • Displaying the service operating summary
    • Adding a service
    • Changing configuration settings
    • Performing service actions
    • Performing rolling restarts
    • Monitoring background operations
    • Removing a service
    • Auditing operations
    • Using Quick Links
    • Refreshing the YARN Capacity Scheduler
    • Managing HDFS
    • Managing Atlas in a Storm environment

    5. Can Ambari manage multiple clusters?

    Answer: No. As of now, an Ambari instance can manage only one cluster. However, we can view the “views” of other, remote clusters from the same instance.

    6. What are the different ways you can secure a cluster using Ambari?

    Answer: The following approaches can be used to secure a cluster using Ambari –

    • For network security, we can enable Kerberos authentication from Ambari
    • We can install Ranger and configure primary authorization from Ambari
    • We can configure Ambari to use Knox SSO
    • We can set up SSL for Ambari

    7. What is the Ambari shell and what is it used for?

    Answer: It is a Java-based command-line tool that uses a Groovy-based Ambari REST client and the Spring Shell framework to execute commands. The shell supports –

    • The functionality available through the Ambari web app
    • Context-aware availability of commands
    • Tab completion
    • Optional and required parameters

    8. What action do you need to take before scheduled maintenance on the cluster nodes?

    Answer: Ambari provides a Maintenance Mode option for all the nodes in the cluster. Before performing maintenance, we can enable Maintenance Mode in Ambari to suppress alerts for those nodes.
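
Maintenance Mode can also be toggled through the REST API. The following sketch only assembles the pieces of such a call for a single host and sends nothing; the server address, cluster name, and host name are placeholders, the function name is invented for this example, and the `maintenance_state` field with its `ON`/`OFF` values is the field commonly used for this purpose in Ambari REST calls.

```python
import json


def maintenance_mode_call(base_url, cluster, host, enable):
    """Return (method, url, body) for switching a host's Maintenance Mode."""
    state = "ON" if enable else "OFF"
    body = {
        "RequestInfo": {"context": f"Turn {state} Maintenance Mode for {host}"},
        "Body": {"Hosts": {"maintenance_state": state}},
    }
    url = f"{base_url}/api/v1/clusters/{cluster}/hosts/{host}"
    return "PUT", url, json.dumps(body)


if __name__ == "__main__":
    # Hypothetical server, cluster, and host names.
    method, url, body = maintenance_mode_call(
        "http://ambari.example.com:8080", "mycluster", "node1.example.com", True
    )
    print(method, url)
    print(body)
```

The returned values can be handed to any HTTP client, together with credentials and the `X-Requested-By` header that Ambari expects on modifying calls.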

    9. What is the role of the “ambari-qa” user?

    Answer: ‘ambari-qa’ is a user account that Ambari creates on all nodes in the cluster. As part of the installation process, this user runs service checks against the cluster services.

    10. Why do you think Apache Ambari would have a promising future?

    Answer: With the increasing demand for big data technologies like Hadoop, data analysis is being used at massive scale, which brings huge clusters into place. To manage these clusters with better operational efficiency and more visibility, companies are leaning towards technologies like Apache Ambari. Moreover, the technology giant Hortonworks is working on Ambari to make it more scalable. Hence, knowledge of Apache Ambari alongside Hadoop is an added advantage.

    Bottom Line

    We have discussed most of the frequently asked Apache Ambari interview questions above. However, the more you understand and practice a technology, the more questions will come to mind. Hence, it is always advisable to master the subject matter.
