Data on the World Map

Businesses generate vast amounts of data every day. Big Data analytics is the discipline of managing and analysing large amounts of data to yield insights, discover correlations and uncover patterns. Done well, it can deliver near-immediate answers about business processes and guide business strategy.

The term "Big Data" is often used today to refer to large amounts of data and to the potential of what that data can be used for. It is also used as a buzzword by many who do not understand how such systems are built or how they generate value. In truth, the field is still being built, defined and understood, but great advances have been made to piece together a framework.

Big Data typically refers to datasets ranging from terabytes into the petabyte range. It is no surprise that such data volumes exist in our tech-driven world, amplified further by the rise of the internet and mobile devices. The impetus for this new revolution is not just the generation of large amounts of data but also the explosion in compute power offered by the cloud, which enables this amount of data to be stored and processed reliably. This has led to the proliferation of AI algorithms that can ingest large amounts of data to learn the underlying structure behind it.

This is a large paradigm shift: it used to be that humans needed to program a model to behave a certain way based on a set of constraints or rules. Deep Learning, fuelled by large amounts of data, flips this notion around: given the right amounts of data, the computer can learn to perform the task itself. This has led to new breakthroughs in industries such as healthcare, finance and consumer goods.

The world's most valuable resource is no longer oil, but data

~ The Economist, 2017

Understanding the Layers of the Value Chain

The following steps provide insight into the journey that every company undertakes on its path to data leadership.

1. Data Engineering

The journey starts by engineering pipelines for data to flow onto our cloud platform of choice. The key to this revolution is data, and so the journey begins at the source. The data goes through a process called Extract, Transform and Load (ETL), which can take the form of streaming data from live sources or batch processing of raw files.
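As a minimal sketch of a batch ETL step (the file contents, machine IDs and column names below are illustrative assumptions, with an in-memory SQLite database standing in for a cloud destination):

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a shop-floor machine (invented data).
raw_csv = """machine_id,timestamp,temperature_c
M-01,2024-01-01T00:00:00, 71.5
M-01,2024-01-01T01:00:00, 73.2
M-02,2024-01-01T00:00:00, 68.9
"""

# Extract: read the raw rows.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: strip whitespace and cast types.
cleaned = [
    (r["machine_id"], r["timestamp"], float(r["temperature_c"]))
    for r in rows
]

# Load: write into a database (SQLite stands in for the cloud platform).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE readings (machine_id TEXT, ts TEXT, temp_c REAL)")
db.executemany("INSERT INTO readings VALUES (?, ?, ?)", cleaned)

count = db.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(count)  # 3 rows loaded
```

A streaming pipeline follows the same extract-transform-load shape, but processes each event as it arrives rather than in batches.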

2. Data Storage/Warehousing Tools

Data must then enter a Data Warehouse, a one-stop database that collects and stores information from various sources. A Data Lake is sometimes used before a data warehouse as a staging area for ETL or data warehousing. A key feature of a data warehouse is that it can cater for Petabyte-scale data and perform queries in a fast and efficient manner. An example of such a service is BigQuery by Google Cloud, which is used in various Google services such as Gmail and YouTube.
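The kind of aggregate query a warehouse answers can be sketched locally. Here SQLite stands in for a service such as BigQuery, and the sales table is invented purely for illustration:

```python
import sqlite3

# A toy "warehouse" table of sales events (invented data).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, amount REAL)")
db.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("North", 80.0), ("South", 200.0)],
)

# The same style of standard-SQL aggregation a cloud warehouse runs,
# only at a vastly larger scale.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY region
"""
totals = db.execute(query).fetchall()
print(totals)  # [('North', 200.0), ('South', 200.0)]
```

The value of a warehouse is that this same query stays fast even when the table holds billions of rows.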

3. Data Visualisation

Once data resides in a data warehouse, value can be extracted by visualising it using software tools such as Google Data Studio or Tableau. The key to this process is for a Data Scientist to conduct drill-downs of the data, normally starting from a business question that is answered by visualising the data as a storyboard.

4. AI / Machine Learning

A further step after visualisation is to build AI models that look for patterns in the underlying data and serve predictions for future requests. For example, an algorithm could predict the best selling price for a given house based on its location, type, amenities and so on. These predictions can be used to offer business value to the customer.
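A hypothetical sketch of such a prediction, fitting an ordinary least-squares line to invented floor-area/price pairs (a real model would use many more features and far more data):

```python
# Hypothetical training data: floor area (square metres) vs sale price.
areas = [50.0, 70.0, 90.0, 110.0]
prices = [150_000.0, 210_000.0, 270_000.0, 330_000.0]

n = len(areas)
mean_x = sum(areas) / n
mean_y = sum(prices) / n

# Ordinary least squares for one feature: slope = cov(x, y) / var(x).
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices))
    / sum((x - mean_x) ** 2 for x in areas)
)
intercept = mean_y - slope * mean_x

def predict_price(area_sqm: float) -> float:
    """Serve a price prediction for an unseen house."""
    return intercept + slope * area_sqm

print(predict_price(100.0))  # 300000.0
```

In practice a model like this is trained offline on warehouse data, then deployed behind an API so each request is answered in milliseconds.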

What value can be derived from data?

Data analytics and the value it brings can be understood through the lens of the Analytic Ascendancy Model produced by Gartner, a global research and advisory firm.

Gartner Analytic Ascendancy Model

Descriptive Analytics

This is the simplest form of analytics, which starts with the fundamental question: "What happened?"

For example, a retailer could look at statistical measures and visualise the sales for day/week/month/year to understand and report on how well the business has been doing.
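A minimal sketch of such descriptive measures over a week of invented sales figures:

```python
from statistics import mean

# Illustrative daily sales figures for one week (invented data).
daily_sales = {
    "Mon": 1200.0, "Tue": 950.0, "Wed": 1100.0,
    "Thu": 1300.0, "Fri": 1700.0, "Sat": 2100.0, "Sun": 1650.0,
}

# Descriptive analytics: summarise what happened.
weekly_total = sum(daily_sales.values())
daily_average = mean(daily_sales.values())
best_day = max(daily_sales, key=daily_sales.get)

print(weekly_total, round(daily_average, 2), best_day)  # 10000.0 1428.57 Sat
```

These are exactly the numbers a dashboard or weekly report would surface.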

Diagnostic Analytics

Moving one step up the ladder is to answer "Why did it happen?"

Diagnostic analytics takes a deeper look by measuring historical data against other data sets to uncover the drivers behind an outcome. Commonly used techniques include drill-downs, correlations, probabilities and pattern identification. By plotting the data over time, it can be analysed in context to understand why the result occurred.
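One such technique, correlation, can be sketched with a Pearson coefficient over two invented series (did discounting drive sales?):

```python
from math import sqrt

# Aligned weekly series: discount offered vs units sold (invented data).
discount_pct = [0.0, 5.0, 10.0, 15.0, 20.0]
units_sold = [100.0, 120.0, 150.0, 170.0, 200.0]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(discount_pct, units_sold)
print(round(r, 3))  # 0.998 — a strong positive relationship
```

A coefficient this close to 1 suggests (though does not prove) that the discounts and the sales uplift moved together.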

Predictive Analytics

Predictive analytics, as the name implies, extrapolates trends into the future. The key is to build models using accurate data to forecast important variables that could affect business performance. This is a much more challenging task than the previous two, as it requires technical knowledge to train an effective model, serve robust predictions and deploy it to production. A typical use case is predicting production demand.
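As a hedged sketch of the demand-forecasting use case, simple exponential smoothing over invented monthly figures (production forecasting models are considerably more sophisticated than this):

```python
# Monthly production demand in units (invented figures).
demand = [100.0, 104.0, 110.0, 108.0, 115.0, 121.0]

# Simple exponential smoothing: each new forecast blends the latest
# observation with the previous forecast. alpha controls how quickly
# the model reacts to recent changes.
alpha = 0.5
forecast = demand[0]
for observed in demand[1:]:
    forecast = alpha * observed + (1 - alpha) * forecast

print(forecast)  # 116.0 — the forecast for the next month
```

Raising alpha towards 1 makes the forecast track recent months more closely; lowering it smooths out noise.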

Prescriptive Analytics

Building on predictive analytics, the final step is to mitigate risk or gain a future advantage given an accurate prediction of the future. Prescriptive analytics surfaces the favourable outcomes and suggests which courses of action need to be taken to reach a particular outcome. Unlike the previous three types of analytics, prescriptive analytics makes use of a feedback loop to learn and improve the relationship between prescribed actions and their outcomes.

An example is a recommendation system, whereby each choice the user makes continuously updates the model to serve up-to-date and accurate predictions, not just for that user but for similar users.
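A toy sketch of that feedback loop, where each click updates the model that produces the next recommendation (the item names are invented, and a production recommender would personalise across users rather than keep global counts):

```python
from collections import defaultdict

# Per-item click counts act as a minimal "model".
clicks = defaultdict(int)

def record_click(item: str) -> None:
    """Feedback: each observed outcome updates the model."""
    clicks[item] += 1

def recommend() -> str:
    """Prescribe the currently best-performing item."""
    return max(clicks, key=clicks.get)

# Simulated stream of user choices.
for item in ["shoes", "hat", "shoes", "shoes", "hat"]:
    record_click(item)

print(recommend())  # shoes
```

The essential property is the loop: every prescribed action generates an outcome, and the outcome refines the next prescription.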

How can Big Data Be Implemented in My Company?

The key to implementation is finding capable talent to build and manage the technologies and the vast amounts of data. This means either training your workforce or hiring externally, and such talent is in short supply and expensive to hire.

BerjutaTech can provide value for your business as your external consultant for Big Data Analytics. We provide detailed plans for implementing a Big Data strategy, architect and build data lakes and data warehouses for your business needs, and build data ingestion pipelines for the machines on your shop floor. We also provide training and talent development for your organisation.

Get in touch to find out more.