Data is indispensable in today's business and technology world. Data analytics is the process of examining large data sets to uncover insights and patterns. Under the Digital India initiative there is an increased focus on developing IT-enabled solutions to improve service delivery. Governments at all levels (Centre, State and local bodies) are making significant investments in e-Governance solutions to realize this vision. This shift to a digital ecosystem has led to tremendous growth in data related to various aspects of Government functions and services. Advanced analytical techniques and tools make it possible to mine this data for insights, from retrospective analysis to prospective analysis, helping decision makers look into the future and plan accordingly.
At its core, a data analytics platform requires a robust infrastructure capable of storing and processing huge amounts of data. The Data Analytics Service of NICSI enables users to build infrastructure with such capabilities. The infrastructure is hosted in the NICSI National Cloud and provides an alternative to setting up capital-intensive in-house data analytics infrastructure. Depending on the intended use, users can choose either the Hadoop (Big Data) stack or the ELK stack to build their ICT infrastructure.
Hadoop is a framework under the Big Data umbrella that helps handle data of high volume, variety and velocity, which traditional approaches fail to handle. It uses multiple machines to run processing in parallel in a distributed manner. The Hadoop family itself consists of multiple tools and technologies, chosen according to need, such as HDFS (Hadoop Distributed File System), Sqoop, Hive, Pig, Spark and Mahout.
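The idea of splitting work across machines and then combining the partial results can be sketched in a few lines of Python. This is only an illustration of the map-and-reduce pattern, not Hadoop itself: threads on one machine stand in for cluster nodes, and the function names `map_chunk` and `word_count` and the sample lines are assumptions made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_chunk(lines):
    """Map step: count the words in one chunk of the input."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def word_count(lines, workers=3):
    """Split the input into chunks, count each chunk concurrently, then merge."""
    # One interleaved chunk per worker, standing in for data blocks
    # that a real cluster would hold on separate machines.
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    # Reduce step: merge the partial results into one total.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


if __name__ == "__main__":
    # Hypothetical sample input, invented for this illustration.
    sample = [
        "service request received",
        "service request approved",
        "cloud registration received",
    ]
    print(word_count(sample).most_common(3))
```

In an actual Hadoop deployment the chunks would be HDFS blocks and the map and reduce steps would be scheduled on different nodes; the sketch only shows why the work can be divided and recombined.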
The ELK Stack is a collection of three open-source products — Elasticsearch, Logstash, and Kibana.
E stands for Elasticsearch: used for storing, indexing and searching logs
L stands for Logstash: used for collecting, processing and shipping logs
K stands for Kibana: a visualization tool.
The ELK Stack is designed to let users take data from any source, in any format, and search, analyze, and visualize that data in real time. This makes applications better able to handle complex search requirements.
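The division of labour between the three products can be pictured with a toy pipeline in plain Python. It does not use Elasticsearch, Logstash or Kibana; each function merely imitates the role described above, and the log format, sample lines and function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical log format assumed for this sketch: "<date> <LEVEL> <message>"
LOG_PATTERN = re.compile(r"^(?P<date>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def parse_line(line):
    """Logstash-like step: turn a raw log line into a structured record."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def build_index(records):
    """Elasticsearch-like step: map each word to the ids of records containing it."""
    index = defaultdict(set)
    for record_id, record in enumerate(records):
        for word in record["message"].lower().split():
            index[word].add(record_id)
    return index


def search(index, records, term):
    """Return the records whose message contains the given word."""
    return [records[i] for i in sorted(index.get(term.lower(), set()))]


def count_by_level(records):
    """Kibana-like step: aggregate records into counts a chart could display."""
    return Counter(record["level"] for record in records)


if __name__ == "__main__":
    # Invented sample lines; the last one does not match the format and is dropped.
    raw = [
        "2024-01-05 INFO user login succeeded",
        "2024-01-05 ERROR database connection timeout",
        "2024-01-06 ERROR user login failed",
        "malformed line",
    ]
    records = [r for r in map(parse_line, raw) if r is not None]
    index = build_index(records)
    print(search(index, records, "login"))
    print(count_by_level(records))
```

The real products add what this sketch leaves out, such as persistent storage, distribution across nodes, full-text relevance scoring and interactive dashboards.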
The choice of stack should depend on the data type, volume, and use case at hand. The ELK stack is best suited to simple searching and web analytics, whereas the Hadoop stack is suited to use cases that require scaling, the capability to handle high volumes of data, and compatibility with third-party tools.
While the National Cloud provides the means to build infrastructure to process the data analytics workload, NICSI has also established the Centre of Excellence for Data Analytics (CEDA) (http://ceda.gov.in) to assist government organizations in deriving insights from their data. CEDA provides world-class Data Analytics services to Government in an efficient and secure manner through its repository of world-class tools and technologies. As a part of its service offerings, it will help departments with:
Defining their analytic needs
Identifying the data sets required to meet those needs
Determining access to the relevant data sources (both within and outside the government)
Building the required data analytic solutions
Integrating departmental data silos to deliver integrated analytics for integrated policy formulation
Setting up and processing large data sets using Big Data solutions
Developing and deploying Business Intelligence solutions in the form of dashboards
Using machine learning algorithms for advanced analytics
Registered Cloud users may click here and submit their Service Request (SR) to avail the above service, whereas new users (i.e. users not yet registered for cloud) are requested to first apply for Cloud Registration by referring to the On-boarding procedure.