29. April 2018 22:50
Data has huge potential when the right value is derived from it. Big data refers to very large collections of data that are analyzed to draw inferences. It is a relatively new concept.
It is a wide field that encompasses a number of terms. Listed here are some terms commonly used alongside big data that one should know to gain a detailed insight into the subject.
- Algorithm – A step-by-step mathematical procedure that is generally used to analyze data. It is usually run by software.
- Data Lake – A huge store of information kept in its raw form is known as a data lake. This information is used to form inferences.
- Data Mining – The process of delving into available information to derive meaningful insights is known as data mining. Experts use a variety of statistical techniques, machine learning algorithms and sophisticated data analysis software to reach significant conclusions.
- Distributed File System – Storing a large volume of data on a single system is often impractical, as it is complex and costly. So the data is generally stored across multiple storage devices. This mechanism of storing data on multiple devices is known as a distributed file system.
- ETL – An acronym for extract, transform and load. It is a three-step process used to ingest information into big data systems in a structured manner. First, raw data is extracted. Then it is converted into a meaningful form. Finally, it is loaded onto the system for use.
- Hadoop – An open-source software framework. Hadoop enables distributed storage and processing of huge volumes of data.
- Data Scientist – A person who is an expert in analyzing data and is well versed in science, statistics, mathematics and other data analysis techniques.
- Data Cleansing – Some sets of information may be incorrect. Data cleansing is the process of removing incorrect or irrelevant information from the database. This is done to maintain the quality of the data, since it is later used for analysis.
- Dark Data – Data that sits in a company’s data repository but has never been put to use.
- NoSQL – A class of database systems whose name is commonly read as ‘Not Only SQL’. These databases are designed to handle large volumes of unstructured data without requiring a fixed schema.
- IoT – The Internet of Things is a growing trend. Since so many devices are connected to the network, a large amount of data is collected through them, which can be used for further analysis.
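
To make the ETL and data cleansing entries above more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sample records, the `people` table and the function names are hypothetical, and an in-memory SQLite database stands in for a real big data system.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from files, logs or an API.
RAW_CSV = """name,age,city
Alice,34,Dallas
bob, 29 ,plano
,41,Austin
Carol,not_a_number,Frisco
"""


def extract(text):
    """Extract: read the raw rows as dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize each row and drop records that cannot be used."""
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()
        age = row["age"].strip()
        city = row["city"].strip().title()
        # Data cleansing: skip rows with a missing name or a non-numeric age.
        if not name or not age.isdigit():
            continue
        cleaned.append((name, int(age), city))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people (name TEXT, age INTEGER, city TEXT)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)
    conn.commit()


# An in-memory database is assumed here purely for demonstration.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
result = conn.execute("SELECT name, age, city FROM people ORDER BY name").fetchall()
print(result)
```

Of the four sample records, only the two complete ones reach the table. The row with a missing name and the row with a non-numeric age are discarded during the transform step, which is where cleansing rules would typically live in a pipeline like this.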
To know more about big data, contact Centex Technologies at (972) 375 - 9654.