Skip to content

stordco/analytics_challenge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 

Repository files navigation

STORD Data Analytics Exercise

The goal of this exercise is to test your ability to tease out non obvious insights from data. We are looking for candidates with sound statistics knowledge and strong analytical capabilities.

The dataset provided to you contains car sales listings and descriptions from Craigslist. Your task is to dig into this dataset and come up with the following aggregations:

  • insightful descriptive statistics that indicate a trend in the data (you can combine columns or use a subset of columns, entirely upto you. Try and be as creative as possible and extract as many non-obvious metrics as you can)
  • groups for your data such that all rows in your data can be easily categorized into groups with similar features. Make sure to list out the features for each group.
  • Optional: Develop a model that predicts the price listing of a car depending on its features.
  • HINT: Make sure you identify and remove outliers in the dataset that may indicate data corruption or skewed results on analysis.
  • HINT: Feel free to choose a subset or combination of columns for grouping the data and in your prediction model

The Data

We will share the sample dataset ahead of the exercise.

Data fields

  • id - entry ID
  • url - listing URL
  • region - craigslist region
  • region_url - region URL
  • price - entry price
  • year - entry year
  • manufacturer - manufacturer of vehicle
  • model - model of vehicle
  • condition - condition of vehicle
  • cylinders - number of cylinders
  • fuel - fuel type
  • odometer - miles traveled by vehicle
  • title_status - title status of vehicle
  • transmission - transmission of vehicle
  • vin - vehicle identification number
  • drive - type of drive
  • size - size of vehicle
  • type - generic type of vehicle
  • paint_color - color of vehicle
  • image_url - image URL
  • description - listed description of vehicle
  • county - county
  • state - state of listing
  • lat - latitude of listing
  • long - longitude of listing

Deliverable

  • Implement your solution as a script in python (Ipython notebooks are preferred)
  • Email the point of contact that sent you this exercise.
    • Include your Ipython notebook with all relevant statistical graphs
    • Make sure to label all graphs and add comments where needed
  • Be prepared to walk through your solution during the on-site interview.

Evaluation

Your submission will be evaluated along the following criteria by the Reviewer

  • Completeness - Does your submission meet the Deliverables specified above?
  • Business Acumen - How easy was it for a business stakeholder (with limited technical knowledge) to understand your solution?
  • Storylining - Evaluate the flow of your solution
  • Creativity - Evaluate your ability to tease out non-obvious insights from the data
  • Technical Capability - How statistically reliable are your insights?
  • Critical Thinking - Please be ready to speak to your thinking proccess for the solution here.

Notes:

  • The good news is that we will not subject you to a code exercise on a whiteboard when you are on-site!
  • Please feel free to ask clarifying questions via email!
  • Thank you for the time you are spending as a candidate with STORD!

About

Coding assessment for Business Analyst Candidates

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published