CAP-E Exam Prep Free practice test →

Free CAP-E Practice Questions

10 free, exam-style CAP-Essentials (Certified Analytics Professional - Essentials) (CAP-E) (CAP-E) practice questions with answers and explanations. No signup required. Work through them below, then take the full free CAP-E practice test to study every exam domain.

The CAP-E exam has 120 questions and runs 3 hours.

These 10 free CAP-E questions are organized by exam domain, so you can see how each part of the CAP-Essentials (Certified Analytics Professional - Essentials) (CAP-E) blueprint is tested. Reveal the answer and explanation under each question.

Domain 1: Business Problem (Question) Framing 15% of exam

Question 1

A sponsor tells an analytics team, "Build me a dashboard to fix our declining sales." Before doing anything else, the analyst's FIRST step should be to:

  1. Work with the sponsor to define a clear, agreed-upon statement of the business problem
  2. Start pulling historical sales data from the warehouse so the work can begin right away
  3. Select a visualization tool and begin designing the layout of the dashboard
  4. Build a predictive model to forecast where sales are heading over the next quarter
Show answer & explanation

Correct answer: A - Work with the sponsor to define a clear, agreed-upon statement of the business problem

Question 2

In a RACI matrix used to organize stakeholders for an analytics project, the person who is ultimately answerable for the outcome and of whom there should be exactly one is the:

  1. Responsible party
  2. Consulted party
  3. Accountable party
  4. Informed party
Show answer & explanation

Correct answer: C - Accountable party

Domain 2: Analytics Problem Framing 16% of exam

Question 3

A business problem is reformulated as: "Predict which customers will cancel their subscription in the next 30 days." In this analytics problem, the customer's cancellation status is the:

  1. Independent variable
  2. Dependent variable
  3. Constraint
  4. Baseline measure
Show answer & explanation

Correct answer: B - Dependent variable

Question 4

Before deploying a new model intended to improve on-time delivery, a team measures the current on-time rate under the existing process. The PRIMARY reason for capturing this figure is to:

  1. Establish a reference point against which the solution's later improvement can be measured
  2. Satisfy an external regulatory requirement that mandates this kind of documentation
  3. Determine which specific machine learning algorithm the team ought to select
  4. Identify the full set of independent variables that will feed into the model
Show answer & explanation

Correct answer: A - Establish a reference point against which the solution's later improvement can be measured

Question 5

During analytics problem framing, an analyst notes that the model must run on data available by 6:00 a.m. daily and cannot use protected demographic attributes. These are BEST described as the project's:

  1. Measures that define what a successful outcome for the project will look like
  2. Drivers, meaning the candidate input variables thought to influence the outcome
  3. Assumptions and constraints
  4. Deliverables that the team has formally committed to hand over to the sponsor
Show answer & explanation

Correct answer: C - Assumptions and constraints

Domain 3: Data 21% of exam

Question 6

Which of the following is the BEST example of unstructured data?

  1. A relational table of customer IDs and their account balances
  2. A spreadsheet containing daily recorded temperature readings
  3. The free-text body of incoming customer support emails
  4. A relational database of product identifiers and prices
Show answer & explanation

Correct answer: C - The free-text body of incoming customer support emails

Question 7

In the "4 Vs" commonly used to characterize big data, the dimension that refers to the trustworthiness and quality of the data is:

  1. Volume
  2. Velocity
  3. Variety
  4. Veracity
Show answer & explanation

Correct answer: D - Veracity

Question 8

A profiling review finds that the same customer appears as three separate records because of slight spelling differences in the name field. Which data quality dimension is MOST directly violated?

  1. Timeliness
  2. Uniqueness
  3. Validity
  4. Accuracy
Show answer & explanation

Correct answer: B - Uniqueness

Question 9

An analyst records where each dataset originated, every transformation applied to it, and which version was used in the final model. The PRIMARY purpose of maintaining this data lineage is to:

  1. Reduce the ongoing storage cost of the organization's central data warehouse
  2. Increase the raw processing and scoring speed of the deployed model
  3. Enable traceability, reproducibility, and auditing of the results
  4. Automatically raise the predictive accuracy of the model that is built
Show answer & explanation

Correct answer: C - Enable traceability, reproducibility, and auditing of the results

Question 10

A column of adult patient ages contains mostly values between 20 and 90, plus several entries of 999. The value 999 was used by a legacy system to mean "unknown." The MOST appropriate action is to:

  1. Leave the values exactly as-is because they are part of the original source data
  2. Treat the 999 entries as missing values and handle them accordingly
  3. Delete every row that happens to contain any numeric age value at all
  4. Replace every age in the column with the average of 999 and the real ages
Show answer & explanation

Correct answer: B - Treat the 999 entries as missing values and handle them accordingly

The rest of the CAP-E blueprint

The CAP-E exam also covers these domains. Drill them in the full free practice test:

Ready for the real thing?

Practice hundreds more CAP-E questions with instant scoring, weak-area drills, and full exam simulations.

Start the free practice test See pricing