
Introduction

Everyone is aware of the role data plays in providing countless benefits to society, businesses and individuals.

When we think of data, we think of health data collected by our smart watches or entertainment data collected from our likes and dislikes on social media. This large volume of data is known as Big data.

In recent years, it has no longer been enough to talk about Big data alone. Other types of data, such as small data and sparse data, have become increasingly important.

Small data is the opposite of Big data. In simple words, we refer to small data when we deal with data that is presented in an accessible, informative and actionable format and that addresses one specific issue.

On the other hand, sparse data refers to the case in which a variable has a high number of cells that do not contain actual data. For example, a column in Excel or Google Sheets with many cells that are empty or marked N/A is a sparse variable. Even though these cells are empty, they still occupy storage space in the file.
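To make the idea of a sparse variable concrete, here is a minimal Python sketch that measures how sparse a single spreadsheet column is. The column values and the function name sparsity are made up for illustration, and None stands in for an empty or N/A cell.

```python
# Hypothetical spreadsheet column: None stands in for an empty or N/A cell.
readings = [12.5, None, None, 7.0, None, None, None, 3.2]


def sparsity(values):
    """Return the fraction of cells that hold no actual data."""
    if not values:
        return 0.0
    empty = sum(1 for v in values if v is None)
    return empty / len(values)


print(f"Share of empty cells: {sparsity(readings)}")
```

A value close to 1 means the variable is mostly empty cells, while a value close to 0 means it is mostly filled in.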

For example, if you’re trying to predict the number of people at a specific station, the data collected at that station may not be enough to train a machine learning model. Adding data points from elsewhere could skew the view of that specific station, so you want to get as much value as possible from the data you do have.

This is a simple but effective way of showing the differences. 

Value of small data

Smaller data inputs could also mean the dataset is incomplete, with missing values that create gaps and make it hard to work with.

Nevertheless, there is value in sparse data, even for machine learning. For every dataset that includes what we consider “big data”, there’s a huge amount of small data that goes to waste.

For example, let’s say that a meteorologist is tracking snowfalls throughout the year. They want to study three specific variables: 

  1. The duration of the snow 
  2. The intensity of the snow 
  3. The temperature of the snow

To better track their data, they create a spreadsheet in which they place the three variables as columns and the months as rows. In the months when it does not snow, no data will be stored. In the months when it does snow, data will be recorded. The result is a sparse dataset, as it mixes rows that hold useful data points with rows that are empty.
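The meteorologist's spreadsheet can be sketched in a few lines of Python. This is only an illustration: the measurements, the units and the names snow_records and to_rows are invented, and only the months with snowfall are stored before being expanded into one row per month.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Made-up measurements for illustration only:
# (duration in hours, intensity in cm per hour, temperature in degrees C)
snow_records = {
    "Jan": (6.0, 1.5, -4.0),
    "Feb": (3.5, 0.8, -2.0),
    "Dec": (8.0, 2.1, -6.5),
}


def to_rows(records, months):
    """Expand the stored records into one spreadsheet row per month."""
    return [(m, *records.get(m, (None, None, None))) for m in months]


rows = to_rows(snow_records, MONTHS)

# Months where every variable is empty, i.e. it did not snow.
empty_months = [m for m, *values in rows if all(v is None for v in values)]
print(f"{len(empty_months)} of {len(rows)} months hold no data")
```

Storing only the months with snowfall keeps the useful data points together, while the expanded rows show how much of the spreadsheet is empty.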

Research has shown that if small data is analysed with human factors in mind, numerous insights can be gathered. These insights tend to be more accurate, transparent and valuable.

The recipe for extracting value from small data is to understand the role of the individual data point inside a large dataset.

For example, if you’re trying to understand how to efficiently manage your assets, small data is what you need. A retail company might want to meet the needs of customers by estimating how many staff to have in the store depending on the time of day. Usually, mornings are busier as everyone is buying their meals for the office, so more staff are needed than in the afternoon.

This is how small data works inside a big dataset. 

Role of small data in time-based prediction 

Small data can be powerful when its value has been correctly extracted. One of the ways of doing so is by using time-based prediction. 

Time-based prediction is the process of analysing time-dependent data: a specific time period is examined with the aim of predicting data that belongs to a different period, usually the future.

For example, a retail shop might want to predict when parking spaces are more likely to be available so that it can attract customers with disabilities at the times when parking is likely to be easier.
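As a rough illustration of the parking example, the Python sketch below averages a handful of past observations per hour of the day and uses those averages as the forecast for the same hours in the future. The observations and the function name hourly_forecast are made up, and this is just one simple approach to time-based prediction, not a description of any particular product or model.

```python
from collections import defaultdict
from statistics import mean

# Made-up history: (hour of day, free parking spaces observed)
history = [
    (9, 4), (9, 6), (9, 5),
    (13, 12), (13, 10), (13, 14),
    (17, 2), (17, 3), (17, 1),
]


def hourly_forecast(observations):
    """Forecast free spaces per hour as the average of past observations."""
    by_hour = defaultdict(list)
    for hour, free in observations:
        by_hour[hour].append(free)
    return {hour: mean(values) for hour, values in by_hour.items()}


forecast = hourly_forecast(history)

# The hour with the highest expected number of free spaces.
best_hour = max(forecast, key=forecast.get)
print(f"Parking is expected to be easiest at {best_hour}:00")
```

Even with only nine observations, the averages give the shop a usable first estimate of when parking is likely to be easiest.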

Normally, machine learning models require thousands and thousands of data points. Big data can be complex and hard to manage. Collecting it can be slow and can leave gaps, and inconsistencies in how the data is collected often silo it into smaller subsets of the bigger whole.

On the other hand, small data can provide decision-makers with important insights when they are conducting time-based predictions. Startups and SMEs can understand future business challenges by using the limited data that they store without engaging in an expensive data collection process.

Small data’s role in business 

Small data plays a big role in business, pun intended.

According to the World Economic Forum, small data presents numerous benefits to companies: 

  • Small data enables companies to understand the diversity in the needs and wants of customers 
  • Companies can understand what customers specifically want and when they want it, and so anticipate their demand 
  • Small data enables companies to understand when and how to enter a specific market 
  • Small data can provide information about inefficiencies in internal processes to improve them 
  • Small data can drastically improve the user and customer experience 

Conclusion 

Small data, big dreams. Even though massive improvements have been made, there are still a lot of data challenges. Whether you want to optimise your assets, reduce waste or improve the customer experience, it does not matter if you don’t have thousands and thousands of spreadsheets.

TUBR can help, so send us an email at seethefuture@gettubr.com.