{"id":12189,"date":"2021-03-08T03:01:08","date_gmt":"2021-03-08T03:01:08","guid":{"rendered":"https:\/\/analystprep.com\/study-notes\/?p=12189"},"modified":"2026-01-30T18:01:42","modified_gmt":"2026-01-30T18:01:42","slug":"data-exploration","status":"publish","type":"post","link":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/","title":{"rendered":"Data Exploration"},"content":{"rendered":"<p><script type=\"application\/ld+json\">\n{\n  \"@context\": \"https:\/\/schema.org\",\n  \"@type\": \"QAPage\",\n  \"mainEntity\": {\n    \"@type\": \"Question\",\n    \"name\": \"Which tokens are most likely noise features when using term frequency for feature selection?\",\n    \"text\": \"Amelia Parker, a junior analyst at ABC Investment Ltd., is improving a machine learning model by incorporating finance-related text data from news articles and tweets. During exploratory data analysis, she visualizes the most informative words based on their term frequency (TF) values. Concerned that some tokens may be noise features for model training, she wants to remove them. To address this, which tokens is she most likely to focus on?\\n\\nA. Low chi-square statistics.\\n\\nB. Very low and very high term frequency (TF) values.\\n\\nC. Low mutual information (MI) value.\",\n    \"answerCount\": 1,\n    \"acceptedAnswer\": {\n      \"@type\": \"Answer\",\n      \"text\": \"The correct answer is B. Tokens with very low and very high term frequency values are commonly considered noise features and are filtered out during vocabulary pruning. Very frequent tokens, such as stopwords, appear across most documents and provide little discriminatory power, while very rare tokens occur in only a few documents and may lead to overfitting. 
Chi-square statistics and mutual information are instead used to identify informative features with strong discriminatory ability.\"\n    }\n  }\n}\n<\/script><br \/>\n<script type=\"application\/ld+json\">\n{\n  \"@context\": \"https:\/\/schema.org\",\n  \"@type\": \"VideoObject\",\n  \"name\": \"Big Data Projects (2025 Level II CFA\u00ae Exam \u2013 Quantitative Methods \u2013 Module 7)\",\n  \"description\": \"Professor James Forjan, PhD, CFA, explains Big Data Projects for the CFA Level II Quantitative Methods curriculum. Learn how to design, manage, clean, and analyze structured and unstructured data for financial forecasting, and how big data techniques and machine learning are applied in finance. The lesson covers the full workflow, from defining data analysis projects to model training, tuning, and evaluation.\",\n  \"uploadDate\": \"2021-12-26\",\n  \"thumbnailUrl\": \"https:\/\/img.youtube.com\/vi\/ifHmwpgHWYY\/maxresdefault.jpg\",\n  \"contentUrl\": \"https:\/\/www.youtube.com\/watch?v=ifHmwpgHWYY\",\n  \"embedUrl\": \"https:\/\/www.youtube.com\/embed\/ifHmwpgHWYY\",\n  \"duration\": \"PT57M47S\"\n}\n<\/script><br \/>\n<iframe loading=\"lazy\" width=\"560\" height=\"315\" src=\"https:\/\/www.youtube.com\/embed\/ifHmwpgHWYY?si=N8ozH5UbiingANwM\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<p>The main objective of data exploration is to investigate and comprehend data distributions and relationships. Data exploration involves three critical tasks: exploratory data analysis, feature selection, and feature engineering.<\/p>\n<h2>Exploratory Data Analysis<\/h2>\n<p>Exploratory Data Analysis (EDA) is the first step in exploring data. The data is summarized and observed using exploratory graphs, charts, and other visualizations. 
The main objective of EDA is to act as a communication medium among project stakeholders and analysts. Additionally, EDA is intended to aid in: understanding data properties, finding patterns and relationships in data, inspecting basic questions and hypotheses, documenting data distributions, and planning modeling strategies for the next steps.<\/p>\n<h3 data-tadv-p=\"keep\">Definitions<\/h3>\n<p><em><strong>Feature selection<\/strong><\/em> is a process of selecting only features that contribute most to the prediction variable or output from the dataset for ML model training. Selecting fewer features decreases ML model complexity and training time.<\/p>\n<p><em><strong>Feature engineering<\/strong><\/em> is a process of generating new features by transforming existing features. Model performance depends heavily on feature selection and engineering.<\/p>\n<h2>Structured Data<\/h2>\n<h3>1. Exploratory Data Analysis<\/h3>\n<p>For structured data, EDA can be performed in one dimension (a single feature) or in multiple dimensions (multiple features). Histograms, bar charts, box plots, and density plots are one-dimensional visualizations, whereas line graphs and scatterplots visualize multi-dimensional data. Additionally, descriptive statistics, such as measures of central tendency and minimum and maximum values for continuous data, are useful for summarizing data. Counts and frequencies for categorical data can be used to gain insight into the distribution of possible values.<\/p>\n<h3>2. Feature Selection<\/h3>\n<p>Features of structured data are represented by different columns of data in a table or matrix. The objective of the feature selection process is to assist in identifying significant features that, when used in a model, retain the essential patterns and complexities of the larger dataset. Further, these features should require less data overall.<\/p>\n<p>Feature selection methods are utilized to rank all features. 
If the target variable of interest is discrete, such techniques as chi-square test, correlation coefficient, and information gain would be applicable. These are univariate techniques that score feature variables individually.<\/p>\n<p>Both dimensionality reduction and feature selection seek to reduce the number of features in a data set. The dimensionality reduction method generates new combinations of features that do not correlate. However, feature selection includes and excludes features present in the data without altering them.<\/p>\n<h3>3. Feature Engineering<\/h3>\n<p>Feature engineering is the process of optimizing and improving the features of the data further. Feature engineering techniques methodically modify, decompose, or combine existing features to produce more significant features. More essential features allow an ML model to train more rapidly and efficiently. This process depends on the context of the project, domain of the data, and nature of the problem. An existing feature can be engineered to a new feature or decomposed to multiple features.<\/p>\n<p>Categorical variables can be converted into a binary form (0 or 1) for machine-reading, a process called one-hot encoding. For example, if a single categorical feature represents gender identities with four possible values\u2014male, female, transgender, gender-neutral\u2014then these values can be decomposed into four new features, one for each possible value (e.g., is_male, is_female) filled with 0s (for false) and 1s (for true).<\/p>\n<h2>Unstructured Data<\/h2>\n<h3>1. Exploratory Data Analysis<\/h3>\n<p>Text data incorporates a collection of texts (also known as a corpus) that are sequences of tokens. It is useful to perform EDA of text data by calculating basic text statistics on the tokens. These may include term frequency (TF), which is the ratio of the number of times a given token occurs in all the texts in the dataset to the total number of tokens. 
Other examples of basic text statistics include word associations, average word and sentence length, and word and syllable counts. These statistics reveal patterns in the co-occurrence of words.<\/p>\n<p>Text modeling involves identifying the words that are most informative in a text by computing the term frequency of each word. The words with high term frequency values are removed since they are likely to be stop words, making the resulting bag-of-words more compact. The chi-square measure of word association applies to sentiment analysis and text classification applications to aid in understanding the significant word appearances in negative and positive sentences in the text.<\/p>\n<p>As with structured data, bar charts and word clouds can be used to visualize text data. Word clouds can be made to visualize the most informative words and their term frequency values. Varying font sizes can show the most commonly occurring words. Further, color is used to add more dimensions, such as frequency and length of words.<\/p>\n<h3>2. Feature Selection<\/h3>\n<p>Feature selection for text data entails selecting a subset of tokens in a data set to effectively reduce the bag-of-words size, making the ML model more efficient and less complicated. Feature selection eliminates noisy features from the dataset. The popular feature selection methods in text data are as follows:<\/p>\n<ul>\n<li><strong><em>Frequency<\/em><\/strong> measures are used for vocabulary pruning to eliminate noise features. Tokens with very high and very low term frequency values are filtered out across all the texts. Noise features can be stopwords that usually occur repeatedly in all the texts across the dataset. On the other end, noise features can be rare terms that are present in only a few text files. <strong>Document frequency (DF) <\/strong>is the number of documents (texts) that contain the respective token divided by the total number of documents. 
It helps to discard the noise features that carry no specific information about the text class and are present across all texts.<\/li>\n<li><em><strong>Chi-square test<\/strong> <\/em>tests the independence of two events: occurrence of the token and occurrence of the class. Tokens with the highest chi-square test statistic values are selected as features for ML model training because of their higher discriminatory potential.<\/li>\n<li><strong><em>Mutual information (MI)<\/em><\/strong> gauges the amount of information contributed by a token to a class of texts. The mutual information value equals 0 if the token\u2019s distribution in all text classes is the same. Otherwise, the MI value approaches 1 as the token in any one class tends to occur more frequently in only that particular class of text.<\/li>\n<\/ul>\n<h3>3. Feature Engineering<\/h3>\n<p>As with structured data, feature engineering is a fundamental step that can dramatically improve ML model training. Some techniques for feature engineering include:<\/p>\n<ul>\n<li><strong><em>Numbers:<\/em> <\/strong>Numbers of different lengths are converted into different tokens. For example, 5-digit numbers can be replaced with \u201c\/number5\/,\u201d 10-digit numbers with \u201c\/number10\/,\u201d and so forth.<\/li>\n<li><strong><em>N-<\/em>grams:<\/strong> These refer to <em>n<\/em> consecutive words. Multi-word patterns that are discriminative can be identified, and their association kept intact. For example, for \u201cstock market,\u201d which signals an economic context, a bigram is appropriate because it treats the two adjacent words as a single token, i.e., stock_market.<\/li>\n<li><strong><em>Named entity recognition (NER)<\/em>:<\/strong> This is an algorithm that takes individual tokens as inputs and pinpoints the relevant nouns such as a person, location, and organization. 
For example, the NER tag for the tokens \u201cCFA\u201d and \u201cinstitute\u201d is \u201cORGANIZATION.\u201d<\/li>\n<li><strong><em>Parts of speech (POS)<\/em>:<\/strong> Similar to NER, POS tagging uses language structure and dictionaries to tag every token in the text with a corresponding part of speech. For example, the POS tag for both tokens, \u201cCFA\u201d and \u201cinstitute,\u201d is \u201cNNP,\u201d which denotes a proper noun. POS tags can be useful for separating verbs and nouns for text analytics.<\/li>\n<\/ul>\n<p>Note that the fundamental objective of feature engineering is to maintain the semantic essence of the text while simplifying and converting it into structured data for ML.<\/p>\n<blockquote>\n<h2>Question<\/h2>\n<p>Amelia Parker is a junior analyst at ABC Investment Ltd. Parker is building an ML model with improved predictive power. She plans to improve the existing model, which relies purely on structured financial data, by incorporating finance-related text data derived from news articles and tweets relating to the company.<\/p>\n<p>After preparing and wrangling the raw text data, Parker performs exploratory data analysis. She creates and analyzes a visualization that shows the most informative words in the dataset based on their term frequency (TF) values to assist in feature selection. However, she is concerned that some tokens are noise features for ML model training; therefore, she wants to remove them.<\/p>\n<p>To address her concern in the exploratory data analysis, Parker is most likely to focus on those tokens that have:<\/p>\n<p>\u00a0 \u00a0 \u00a0A. Low chi-square statistics.<\/p>\n<p>\u00a0 \u00a0 \u00a0B. Very low and very high term frequency (TF) values.<\/p>\n<p>\u00a0 \u00a0 \u00a0C. Low mutual information (MI) value.<\/p>\n<h3>Solution<\/h3>\n<p><em><strong>The correct answer is B.<\/strong><\/em><\/p>\n<p>Frequency measures are used for vocabulary pruning to eliminate noise features. 
Tokens with very high and very low TF values are filtered out across all the texts. Noise features are both the most frequent and the rarest tokens in the dataset. On one end, noise features can be stopwords that typically appear frequently in all the texts across the dataset.<\/p>\n<p>On the other end, noise features can be sparse terms that are present in only a few text files. Recurring tokens make it harder for the ML model to choose a decision boundary among the texts during text classification because the terms are present across all the texts. The sparse tokens mislead the ML model into classifying texts containing the rare terms into a specific class, a case of overfitting. Thus, pinpointing and eliminating noise features are essential steps for feature selection procedures.<\/p>\n<p><em><strong>A is incorrect. <\/strong><\/em>The chi-square test tests the independence of two events: occurrence of the token and occurrence of the class. Tokens with the highest chi-square test statistic values are selected as features for ML model training because of their higher discriminatory potential.<\/p>\n<p><em><strong>C is incorrect.<\/strong><\/em>\u00a0Mutual information (MI) gauges the amount of information contributed by a token to a class of texts. The mutual information value equals 0 if the token\u2019s distribution in all text classes is the same. Otherwise, the MI value approaches 1 as the token in any one class tends to occur more frequently in only that particular class of text.<\/p>\n<\/blockquote>\n<p>Reading 7: Big Data Projects<\/p>\n<p><em>LOS 7 (c) Describe objectives, methods, and examples of data exploration<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The main objective of data exploration is to investigate and comprehend data distributions and relationships. Data exploration involves three critical tasks: exploratory data analysis, feature selection, and feature engineering. 
Exploratory Data Analysis Exploratory Data Analysis (EDA) is the first step&#8230;<\/p>\n","protected":false},"author":5,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[102,229],"tags":[216,264,230],"class_list":["post-12189","post","type-post","status-publish","format-standard","hentry","category-cfa-level-2","category-quantitative-method","tag-cfa-level-2","tag-data-exploration","tag-quantitative-method","blog-post","no-post-thumbnail","animate"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Data Exploration in Quantitative Analysis | CFA Level II<\/title>\n<meta name=\"description\" content=\"Learn data exploration techniques in CFA Level II, including feature exploration, summary statistics, and visualization methods used to analyze data.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Data Exploration in Quantitative Analysis | CFA Level II\" \/>\n<meta property=\"og:description\" content=\"Learn data exploration techniques in CFA Level II, including feature exploration, summary statistics, and visualization methods used to analyze data.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/\" \/>\n<meta property=\"og:site_name\" content=\"CFA, FRM, and Actuarial Exams Study Notes\" \/>\n<meta property=\"article:published_time\" content=\"2021-03-08T03:01:08+00:00\" \/>\n<meta property=\"article:modified_time\" 
content=\"2026-01-30T18:01:42+00:00\" \/>\n<meta name=\"author\" content=\"Irene R\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Irene R\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/\"},\"author\":{\"name\":\"Irene R\",\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/#\\\/schema\\\/person\\\/7002f30d8f174958802c1c30b167eaf5\"},\"headline\":\"Data Exploration\",\"datePublished\":\"2021-03-08T03:01:08+00:00\",\"dateModified\":\"2026-01-30T18:01:42+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/\"},\"wordCount\":1661,\"keywords\":[\"CFA-level-2\",\"Data Exploration\",\"Quantitative Method\"],\"articleSection\":[\"CFA Level II Study Notes\",\"Quantitative Method\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/\",\"url\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/\",\"name\":\"Data Exploration in Quantitative Analysis | CFA Level 
II\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/#website\"},\"datePublished\":\"2021-03-08T03:01:08+00:00\",\"dateModified\":\"2026-01-30T18:01:42+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/#\\\/schema\\\/person\\\/7002f30d8f174958802c1c30b167eaf5\"},\"description\":\"Learn data exploration techniques in CFA Level II, including feature exploration, summary statistics, and visualization methods used to analyze data.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/cfa-level-2\\\/quantitative-method\\\/data-exploration\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Data Exploration\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/#website\",\"url\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/\",\"name\":\"CFA, FRM, and Actuarial Exams Study Notes\",\"description\":\"Question Bank and Study Notes for the CFA, FRM, and Actuarial exams\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/#\\\/schema\\\/person\\\/7002f30d8f174958802c1c30b167eaf5\",\"name\":\"Irene 
R\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/33caf1e1bcb63ee970b36351f165c7bc714b19614993ab9c2c8bf36273b7df48?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/33caf1e1bcb63ee970b36351f165c7bc714b19614993ab9c2c8bf36273b7df48?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/33caf1e1bcb63ee970b36351f165c7bc714b19614993ab9c2c8bf36273b7df48?s=96&d=mm&r=g\",\"caption\":\"Irene R\"},\"url\":\"https:\\\/\\\/analystprep.com\\\/study-notes\\\/author\\\/irene\\\/\"}]}<\/script>\n<meta property=\"og:video\" content=\"https:\/\/www.youtube.com\/embed\/ifHmwpgHWYY\" \/>\n<meta property=\"og:video:type\" content=\"text\/html\" \/>\n<meta property=\"og:video:duration\" content=\"3468\" \/>\n<meta property=\"og:video:width\" content=\"480\" \/>\n<meta property=\"og:video:height\" content=\"270\" \/>\n<meta property=\"ya:ovs:adult\" content=\"false\" \/>\n<meta property=\"ya:ovs:upload_date\" content=\"2021-03-08T03:01:08+00:00\" \/>\n<meta property=\"ya:ovs:allow_embed\" content=\"true\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Data Exploration in Quantitative Analysis | CFA Level II","description":"Learn data exploration techniques in CFA Level II, including feature exploration, summary statistics, and visualization methods used to analyze data.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/","og_locale":"en_US","og_type":"article","og_title":"Data Exploration in Quantitative Analysis | CFA Level II","og_description":"Learn data exploration techniques in CFA Level II, including feature exploration, summary statistics, and visualization methods used to analyze data.","og_url":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/","og_site_name":"CFA, FRM, and Actuarial Exams Study Notes","article_published_time":"2021-03-08T03:01:08+00:00","article_modified_time":"2026-01-30T18:01:42+00:00","author":"Irene R","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Irene R","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/#article","isPartOf":{"@id":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/"},"author":{"name":"Irene R","@id":"https:\/\/analystprep.com\/study-notes\/#\/schema\/person\/7002f30d8f174958802c1c30b167eaf5"},"headline":"Data Exploration","datePublished":"2021-03-08T03:01:08+00:00","dateModified":"2026-01-30T18:01:42+00:00","mainEntityOfPage":{"@id":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/"},"wordCount":1661,"keywords":["CFA-level-2","Data Exploration","Quantitative Method"],"articleSection":["CFA Level II Study Notes","Quantitative Method"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/","url":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/","name":"Data Exploration in Quantitative Analysis | CFA Level II","isPartOf":{"@id":"https:\/\/analystprep.com\/study-notes\/#website"},"datePublished":"2021-03-08T03:01:08+00:00","dateModified":"2026-01-30T18:01:42+00:00","author":{"@id":"https:\/\/analystprep.com\/study-notes\/#\/schema\/person\/7002f30d8f174958802c1c30b167eaf5"},"description":"Learn data exploration techniques in CFA Level II, including feature exploration, summary statistics, and visualization methods used to analyze 
data.","breadcrumb":{"@id":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/analystprep.com\/study-notes\/cfa-level-2\/quantitative-method\/data-exploration\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/analystprep.com\/study-notes\/"},{"@type":"ListItem","position":2,"name":"Data Exploration"}]},{"@type":"WebSite","@id":"https:\/\/analystprep.com\/study-notes\/#website","url":"https:\/\/analystprep.com\/study-notes\/","name":"CFA, FRM, and Actuarial Exams Study Notes","description":"Question Bank and Study Notes for the CFA, FRM, and Actuarial exams","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/analystprep.com\/study-notes\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/analystprep.com\/study-notes\/#\/schema\/person\/7002f30d8f174958802c1c30b167eaf5","name":"Irene R","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/33caf1e1bcb63ee970b36351f165c7bc714b19614993ab9c2c8bf36273b7df48?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/33caf1e1bcb63ee970b36351f165c7bc714b19614993ab9c2c8bf36273b7df48?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/33caf1e1bcb63ee970b36351f165c7bc714b19614993ab9c2c8bf36273b7df48?s=96&d=mm&r=g","caption":"Irene 
R"},"url":"https:\/\/analystprep.com\/study-notes\/author\/irene\/"}]},"og_video":"https:\/\/www.youtube.com\/embed\/ifHmwpgHWYY","og_video_type":"text\/html","og_video_duration":"3468","og_video_width":"480","og_video_height":"270","ya_ovs_adult":"false","ya_ovs_upload_date":"2021-03-08T03:01:08+00:00","ya_ovs_allow_embed":"true"},"_links":{"self":[{"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/posts\/12189","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/comments?post=12189"}],"version-history":[{"count":16,"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/posts\/12189\/revisions"}],"predecessor-version":[{"id":42255,"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/posts\/12189\/revisions\/42255"}],"wp:attachment":[{"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/media?parent=12189"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/categories?post=12189"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/analystprep.com\/study-notes\/wp-json\/wp\/v2\/tags?post=12189"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}