Why should we understand the importance of big data and data mining and prepare for the future?

In this blog post, we will look at why big data and data mining are important and how their development will affect our future.

 

At some point, we began to encounter the unfamiliar term “big data” in various media. It has only been a few years since this term began to be used widely, but we are already familiar with it. Even expressions such as “marketing utilizing big data” are now commonplace. So what is it about big data and data mining that is attracting so much attention? To answer this question, we first need to understand the concept of big data and its background.
Big data literally means a huge set of data. From simple numbers to complex CCTV footage, any data that can be stored on a storage medium, regardless of its format, can become big data once it is collected and aggregated. The interesting thing here is that, in terms of format, there is no significant difference between conventional data and big data. However, if big data simply meant a large amount of data, it would already have become a hot topic in the late 1990s and early 2000s, when computer technology was developing rapidly. So why did big data only become a hot topic in the 2010s? The answer is closely related to the following three technological developments.
First, the biggest factor is the paradigm shift in CPU development. The CPU (Central Processing Unit) is the brain of a computer that performs calculations, and for decades it advanced so quickly that Moore’s Law became widely accepted. Strictly speaking, Moore’s Law describes the number of transistors on a chip doubling roughly every two years; it is often popularly restated as CPU performance doubling every 18 months. However, around 2004, CPU development hit a wall known as the “4GHz barrier.” Until then, manufacturers had made CPUs faster mainly by packing more transistors (computing elements) into a single core (computing unit) and raising its clock speed. As transistor density and clock speeds rose, however, heat generation became a serious problem, and a new approach was needed. Instead of pushing a single core ever faster, CPU manufacturers put multiple cores in a single CPU, advancing the technology for processing data in parallel. As a result, vast amounts of data that had previously been difficult to handle because of limited processing speed could be split across cores and processed more quickly and efficiently.
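To make the idea of parallel data processing on a multi-core CPU more concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular big data system works: the function names (chunked, sum_of_squares, parallel_sum_of_squares) and the toy task of summing squares are assumptions chosen for this example. The data is split into one chunk per worker, each chunk is processed independently, and the partial results are combined at the end.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def chunked(data, n_chunks):
    """Split data into at most n_chunks contiguous slices of near-equal size."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The work done on one chunk (a stand-in for any CPU-heavy task)."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=None, executor_cls=ProcessPoolExecutor):
    """Process each chunk in its own worker, then combine the partial results.

    executor_cls is a hypothetical knob for this sketch: the default
    ProcessPoolExecutor runs chunks in separate processes, which the
    operating system can schedule on separate cores. ThreadPoolExecutor
    has the same interface but, in standard CPython, does not speed up
    CPU-bound work like this.
    """
    workers = workers or os.cpu_count() or 1
    chunks = chunked(data, workers)
    with executor_cls(max_workers=workers) as pool:
        partial_sums = pool.map(sum_of_squares, chunks)
        return sum(partial_sums)
```

When run as a script, a call such as parallel_sum_of_squares(list(range(1_000_000))) should sit under an if __name__ == "__main__": guard, because process pools on some platforms re-import the main module. For a task this small, the cost of starting processes and copying data can outweigh the gain; the split-process-combine pattern pays off when each chunk involves substantial work.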
In addition, advances in storage media also played an important role in ushering in the era of big data. Storage media such as hard disk drives (HDDs) have seen dramatic improvements in both capacity and speed. There was a time when a capacity of around 1 GB was standard, but hard disk drives with capacities of 8 TB or more are now commonplace, and the advent of high-speed storage media such as solid-state drives (SSDs) has greatly improved the speed at which large amounts of data can be read and written. Thanks to these advances, large amounts of data that were previously difficult to use because of storage limitations can now be handled far more easily.
While advances in CPUs and storage media made it possible to use big data, changes in the way data is collected have further expanded its scope. The rapid spread of smart devices and social media in the 2010s changed the paradigm of data collection. Smart devices connected directly or indirectly to networks collect user data through built-in components such as cameras, GPS (Global Positioning System), and NFC (Near Field Communication), and upload this data to the network in real time. In addition, users of social media such as Facebook and Twitter voluntarily share their personal information online, which also accounts for a large part of this vast amount of data. In the past, it was common for an entity with a specific purpose to collect only the data it was targeting, but now the data that is constantly generated by and flows through smart devices and social media is collected indiscriminately. With the advancement of network technology, more and more things are being connected to the Internet, ushering in the era of the Internet of Things (IoT) and further expanding the scope of data collection.
The combination of multi-core CPUs, advances in storage media, and the expanding scope of data collection has given rise to the concept of big data. Today, numerous companies and government agencies are analyzing the big data they have collected and striving to find meaningful information within it, and various media outlets constantly emphasize its importance. However, the most important reason we need to think deeply about big data is that what we are experiencing now is only the beginning. In the future, multi-core CPUs will evolve to perform more calculations in parallel, and storage media will offer larger capacities and faster speeds. Along with this, the amount of data collected will keep growing rapidly as more and more objects are connected to networks. The big data we know today may be considered small in scale in the true big data era of the future.

 
