As the atom is to the human body, so is data to enterprise app development. Remember the minions working for “Gru” in the animated movie Despicable Me? Well, data scientists are no different. However fancy the term “data scientist” may sound, their job is to make sense of vast stores of big data. A data scientist basically combines the skills of a software programmer and a statistician, extracting the vital pieces of information hidden under heaps of data.

The traditional method of transforming data into knowledge relies heavily on manual analysis and interpretation. In the healthcare sector, for example, it is common practice for specialists to analyze current trends and changes in health-care data, typically on a quarterly basis. They compile a report that provides a detailed analysis to the sponsoring health-care organization. This report becomes the Holy Grail for future health-care management plans and plays a crucial role in the decision-making process. Irrespective of the domain, be it science, marketing, finance, health care, retail, information technology, or any other field, the classical approach banks upon one or more analysts who engage extensively with the data and act as an interface between the data and its users and products. The roadblock is that manual probing of a data set is slow, expensive, and highly subjective.

Big data, on the other hand, is extremely large in both quantity and complexity, and it is accumulating at a very fast pace. There is a desperate need for new computational theories and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volumes of digital data. At a surface level, the field of KDD (Knowledge Discovery in Databases) is concerned with making sense of available data. The KDD process addresses the basic problem of mapping low-level data into a more comprehensible form, such as a report.
Central to the KDD process is the application of specific data-mining methods for pattern discovery and extraction. Major KDD application areas include marketing, fraud detection, telecommunication, and manufacturing. The process encompasses data storage and access, scaling algorithms to massive data sets, and interpreting results. Artificial intelligence also supports KDD by discovering empirical laws from experimentation. The outcome should hold for valid data with some degree of certainty, and the newly recognized patterns are considered new knowledge.

In his book Cloud Computing for Machine Learning and Cognitive Applications, Kai Hwang explains the steps involved in the entire KDD process as follows:

1. Identify the goal of the KDD process from the customer’s perspective.
2. Understand the application domains involved and the knowledge that is required.
3. Select a target data set, or a subset of data samples, on which discovery is to be performed.
4. Cleanse and preprocess the data by deciding on strategies to handle missing fields, and alter the data as required.
5. Simplify the data sets by removing unwanted variables. Then analyze the useful features that can be used to represent the data, depending on the goal.
6. Suggest hidden patterns by matching the KDD goals with data mining methods.
7. Choose data mining algorithms to discover hidden patterns. This includes deciding which models and parameters might be appropriate for the overall KDD process.
8. Search for patterns of interest in a particular representational form, such as classification rules or trees, regression, and clustering.
9. Interpret essential knowledge from the mined patterns.
10. Use the knowledge and incorporate it into another system for further action. Document it and prepare reports for interested parties.

Traditional statistical methods worked well with small data sets. However, that approach fails badly when working with huge databases.
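As a rough illustration of how the KDD steps described above fit together, here is a minimal Python sketch on a toy data set. Everything in it is an assumption made for illustration and is not taken from Hwang's book: the records, the field names (spend, visits, region), the mean-fill strategy for missing values, and the tiny two-cluster k-means used as a stand-in mining algorithm.

```python
from statistics import mean

# Hypothetical toy records; one "spend" value is missing on purpose.
RAW = [
    {"id": 1, "spend": 20.0, "visits": 2, "region": "north"},
    {"id": 2, "spend": None, "visits": 3, "region": "north"},
    {"id": 3, "spend": 210.0, "visits": 14, "region": "north"},
    {"id": 4, "spend": 190.0, "visits": 12, "region": "north"},
    {"id": 5, "spend": 25.0, "visits": 1, "region": "north"},
]


def select(rows, fields):
    """Selection step: keep only the target fields."""
    return [{f: r[f] for f in fields} for r in rows]


def cleanse(rows, field):
    """Cleansing step: fill missing values with the mean of the known ones."""
    fill = mean(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def drop_constant(rows):
    """Simplification step: remove variables that only ever take one value."""
    keep = [f for f in rows[0] if len({r[f] for r in rows}) > 1]
    return [{f: r[f] for f in keep} for r in rows]


def two_means(values, iterations=10):
    """Mining step: 1-D k-means with k=2.

    Assumes the input holds at least two distinct values.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(low), mean(high)
    return lo, hi


def run_kdd(raw):
    """Chain the steps: select, cleanse, simplify, then mine."""
    rows = select(raw, ["spend", "visits", "region"])
    rows = cleanse(rows, "spend")
    rows = drop_constant(rows)
    centers = two_means([r["spend"] for r in rows])
    return rows, centers
```

Calling run_kdd(RAW) returns the cleaned rows and two cluster centers. Interpreting those centers (for example, as "low" and "high" spenders) and acting on them would correspond to the final steps of the process, which this sketch leaves to the reader.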
A large database can involve millions of rows and scores of columns of data, so scalability becomes a huge pain while mining it. Even the best big data and Hadoop developers face major technical challenges in developing models that do a better job of analyzing data and detecting non-linear relationships and interactions between elements.

Extensive use of analytics, data, and fact-based decision making largely shapes the competition between companies. Instead of relying on the traditional approach, most companies leverage statistical and quantitative analysis, and predictive modeling becomes a pivotal element of competition. According to Gartner, self-learning (ML-powered) intelligent systems will continue to reign supreme in the technology marathon through 2020. As companies strive for a more “connected world”, the physical and digital layers around us will overlap.

Gartner’s forecast for the field of data science in 2018:

Smart Apps: For the next several years there will be a rise in AI-driven apps and services. Managed software platforms are currently racing to integrate AI into their existing systems for enhanced performance and added value. This trend includes the use of digital assistants and virtual services.

Intelligent Things: To make human life easier, technology will progress toward “Intelligent Things”. These are basically semi-robotic, smarter versions of regular gadgets and equipment.

Digital Twins: In the context of IoT, digital twins will bring together the connected world of sensors and humans.

Security: To make better decisions in the complex security environment of digital business, companies will follow a sophisticated strategy known as “Continuous Adaptive Risk and Trust Assessment” (CARTA). The basic premise of CARTA is trust.

Edge Computing: Computing infrastructure that exists close to the sources of data, for example industrial machines (e.g.
wind turbines, magnetic resonance (MR) scanners, undersea blowout preventers), industrial controllers such as SCADA systems, and time series databases aggregating data from a variety of equipment and sensors. In other words, information processing and content collection and delivery are placed closer to the sources of that information.

Intelligent Platforms: Human expectations of digital systems, which propel machine intelligence, will rise in the future.

Artificial Intelligence: 59% of organizations are still working on their AI strategies, while the remaining 41% are already in the running in the AI space.

Augmented Reality: AR has already revolutionized the world around us. Human-machine interaction will improve as research breakthroughs in AR and VR come about.

Blockchain: It is believed that blockchain will eventually mature and become the backbone of the information technology domain in the coming years. It will ease transactions by enabling parties that do not trust each other to transact. It will be predominant in sectors like finance, healthcare, and content delivery.

Serverless Computing: Serverless computing is a cloud computing execution model in which the cloud provider dynamically manages the allocation of machine resources. Industry experts say that serverless computing will have as big an impact on software development as Unity3D had on game development.

For the next decade or two, the wave within the data science industry will be big data, artificial intelligence (AI), and machine learning (ML), along with newer technologies such as blockchain, edge computing, and serverless computing.

Last year LinkedIn made some interesting findings about big data and Hadoop developers:

Job growth in the next decade is expected to outstrip growth during the previous decade, creating 11.5M jobs by 2026, according to the U.S. Bureau of Labor Statistics.

Machine Learning Engineers, Data Scientists, and Big Data Engineers rank among the top emerging jobs on LinkedIn.
Data scientist roles have grown over 650% since 2012.

Currently, 35,000 people in the US have data science skills, while hundreds of companies are hiring for those roles.

There are currently 1,829 open Machine Learning Engineering positions on LinkedIn.