Data science is an interdisciplinary field that draws on computer science, statistics, and related fields to extract information from data, particularly Big Data.
In big data analysis, data analysts and scientists use five data processing methods: capture, curation, storage, search, and transfer.
Visualization involves presenting data graphically for better understanding. With structured data, tables, charts, and trends are common. Unstructured data requires creative approaches like 3D graphics, tag clouds, and mind maps.
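As a minimal sketch of presenting structured data graphically, the snippet below renders a small dataset as a horizontal text bar chart; the category names and values are hypothetical illustration data.

```python
# Hypothetical structured dataset: quarterly revenue figures.
quarterly_revenue = {"Q1": 120, "Q2": 150, "Q3": 90, "Q4": 180}

def bar_chart(data, width=30):
    """Return a list of lines, one horizontal bar per category,
    scaled so the largest value fills the full width."""
    peak = max(data.values())
    lines = []
    for label, value in data.items():
        bar = "#" * round(width * value / peak)
        lines.append(f"{label:>3} | {bar} {value}")
    return lines

for line in bar_chart(quarterly_revenue):
    print(line)
```

In practice a plotting library would replace the text bars, but the idea is the same: map each value to a visual length so patterns stand out faster than in a raw table.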
In investment management, fintech is applied to text analytics and natural language processing, risk assessment, and algorithmic trading.
Text analytics uses computer programs to analyze unstructured text or voice data. It works with sources like company filings, reports, earnings calls, and social media content. It can help predict future performance by identifying indicators like consumer sentiment.
Natural language processing (NLP) is an area of study that involves creating computer programs to decipher and analyze human language. Essentially, NLP combines computer science, AI, and linguistics.
Translation, speech recognition, text mining, sentiment analysis, and topic analysis are examples of automated tasks that use NLP. Annual reports, call transcripts, news articles, social media posts, and other text- and audio-based data may all be analyzed using NLP, allowing trends to be discovered more quickly and accurately than is humanly possible.
With natural language processing, we can make earnings projections for a company's short-term future. We can also use Twitter sentiment to assess the success of an initial public offering (IPO).
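A toy sketch of the sentiment-analysis task such systems automate is shown below: a lexicon-based scorer that counts positive and negative cue words. The word lists and sample sentences are illustrative only; production systems use far larger lexicons or trained models.

```python
# Hypothetical cue-word lexicons for financial text (illustrative only).
POSITIVE = {"growth", "beat", "strong", "upgrade", "record"}
NEGATIVE = {"loss", "miss", "weak", "downgrade", "decline"}

def sentiment_score(text):
    """Score text in [-1, 1]: +1 if all cues are positive,
    -1 if all are negative, 0 if no cues (or they balance out)."""
    words = (w.strip(".,!?") for w in text.lower().split())
    pos = neg = 0
    for w in words:
        pos += w in POSITIVE
        neg += w in NEGATIVE
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

print(sentiment_score("Record growth and a strong quarter"))    # positive
print(sentiment_score("Weak guidance after the earnings miss"))  # negative
```

Aggregating such scores over many filings or posts is what lets analysts track consumer or investor sentiment at scale.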
Commonly used programming languages include Python, R, and Excel VBA. Prominent database technologies include SQL-based systems such as SQLite, as well as NoSQL databases.
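As a small sketch of combining two of these tools, the snippet below queries an SQL database from Python using the standard library's `sqlite3` module; the table name and rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database with a hypothetical price table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, close REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?)",
    [("AAA", 10.5), ("BBB", 20.0), ("AAA", 11.0)],
)

# SQL aggregate query: average closing price per ticker.
rows = conn.execute(
    "SELECT ticker, AVG(close) FROM prices GROUP BY ticker ORDER BY ticker"
).fetchall()
print(rows)
conn.close()
```

The same query syntax carries over to larger SQL systems; only the connection setup changes.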
Question
Which of the five data processing methods refers to the process of ensuring data quality and accuracy through a data cleaning exercise?
- A. Data search.
- B. Data storage.
- C. Data curation.
The correct answer is C.
Data curation refers to the process of ensuring data quality and accuracy through a data cleaning exercise. It involves uncovering data errors and adjusting for missing data.
A is incorrect. Data search refers to how to query data. Big data requires advanced techniques to locate requested data content.
B is incorrect. Data storage refers to how the data will be recorded, archived, and accessed, and to the underlying database design.
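The data-cleaning exercise that data curation involves can be sketched as follows: flag invalid entries and adjust for missing values. The records, sentinel value, and mean-imputation rule below are illustrative assumptions, not a standard recipe.

```python
# Hypothetical raw readings: None marks missing data,
# -1.0 is a sentinel for a known bad reading.
records = [42.0, None, 39.5, -1.0, 41.2]

def curate(values, bad_sentinel=-1.0):
    """Replace missing or invalid entries with the mean of the
    valid ones (a simple mean-imputation adjustment)."""
    valid = [v for v in values if v is not None and v != bad_sentinel]
    mean = sum(valid) / len(valid)
    return [v if v in valid else mean for v in values]

print(curate(records))
```

Real curation pipelines apply many such rules (range checks, deduplication, cross-source reconciliation), but each follows this pattern of uncovering errors and then adjusting for them.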