Curriculum version: 1.0.0
(see CHANGELOG)
Python is an extremely popular programming language for machine learning and data engineering. The courses below give you the foundational knowledge to be productive with data engineering platforms and tools such as Palantir Foundry, Databricks, and Spark.
Topics covered:
Courses | Duration | Effort | Prerequisites | Discussion |
---|---|---|---|---|
Introduction To Python Scripting | 5 hours | 5 hours/week | Core Curriculum | chat |
Introduction To Python Development | 20 hours | 20 hours/week | Core Curriculum, Introduction To Python Scripting | chat |
Python Use Cases | 5 hours | 5 hours/week | Core Curriculum, Introduction To Python Development | chat |
Python Certification Course | 20 hours | 20 hours/week | Core Curriculum, All courses above | chat |
Topics covered:
- Learn all the essential SQL commands
- Become competent in using sorting and filtering commands in SQL
- Enhance the performance of your database by using views and indexes
- Become proficient in SQL constructs such as GROUP BY, JOINs, and subqueries
- Master SQL's most popular string, mathematical, and date-time functions
- Increase your efficiency by learning best practices for writing SQL queries
Courses | Duration | Effort | Prerequisites | Discussion |
---|---|---|---|---|
SQL Basics | 20 hours | 20 hours/week | Core Curriculum | chat |
SQL Masterclass | 40 hours | 20 hours/week | Core Curriculum, SQL Basics | chat |
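
To give a feel for the SQL topics listed above, here is a minimal sketch that runs a few of them (an index, a view, a JOIN with GROUP BY, and a subquery with sorting) through Python's built-in `sqlite3` module. The `customers` and `orders` tables, their columns, and the rows are hypothetical and exist only for this illustration; the courses themselves may use a different database and dataset.

```python
import sqlite3

# In-memory database; the schema and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    -- An index on the join column can speed up lookups on larger tables.
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    -- A view gives a reusable name to a query (here a JOIN plus GROUP BY).
    CREATE VIEW customer_totals AS
        SELECT c.name AS name, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name;
""")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace"), (3, "Linus")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 30.0), (2, 1, 20.0), (3, 2, 10.0)],
)

# Filter with a subquery and sort the result:
# customers whose order total is above the average total.
above_average = conn.execute("""
    SELECT name, total
    FROM customer_totals
    WHERE total > (SELECT AVG(total) FROM customer_totals)
    ORDER BY total DESC
""").fetchall()
print(above_average)
```

Because the view uses an inner JOIN, a customer with no orders (Linus in this made-up data) does not appear in `customer_totals` at all; a LEFT JOIN would be the usual choice if such customers should be included.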
Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Spark is a key component of platforms such as Palantir Foundry and Databricks, making it the cornerstone of the data engineering stack.
Topics covered:
- Apply Spark programming basics, including parallel programming basics for DataFrames, datasets, and Spark SQL.
- Hadoop Ecosystem, Core Components and HDFS
- YARN and its advantages
- Hadoop Cluster and its Architecture
- Big Data Analytics with Batch & Real-Time Processing
- PySpark
Courses | Duration | Effort | Prerequisites | Discussion |
---|---|---|---|---|
PySpark Certification Training | 120 hours | 20 hours/week | Python Certification, SQL Certification | chat |
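
The idea of data parallelism mentioned above can be sketched without a cluster. The example below is not Spark code and does not use the PySpark API; it is a standard-library illustration, under that stated assumption, of the pattern Spark automates: split the data into partitions, process each partition independently, then combine the partial results. The function names `count_words` and `parallel_word_count` and the sample lines are hypothetical. It uses worker threads for simplicity, so it shows the structure of the computation but not a real speed-up.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(partition):
    """Map step: count the words within a single partition of lines."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def parallel_word_count(lines, num_partitions=2):
    """Split lines into partitions, count each one concurrently, then merge."""
    # Round-robin partitioning: partition i gets lines i, i + n, i + 2n, ...
    partitions = [lines[i::num_partitions] for i in range(num_partitions)]
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    # Reduce step: merge the per-partition counts into one result.
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


lines = [
    "spark runs on clusters",
    "spark splits data",
    "clusters process data in parallel",
]
word_counts = parallel_word_count(lines)
print(word_counts["spark"], word_counts["data"])
```

In Spark the partitioning, scheduling, and recovery from failed workers are handled by the engine, which is what the description above means by implicit data parallelism and fault tolerance.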
This MicroMasters program in Statistics and Data Science consists of four online courses and a virtually proctored exam. It provides the foundational knowledge essential to understanding the methods and tools used in data science, along with hands-on training in data analysis and machine learning. You will dive into the fundamentals of probability and statistics, and you will learn, implement, and experiment with data analysis techniques and machine learning algorithms. This program will prepare you to become an informed and effective practitioner of data science who adds value to an organization. For admitted students, the program certificate can be applied towards a PhD in Social and Engineering Systems (SES) through the MIT Institute for Data, Systems, and Society (IDSS), or it may accelerate your path towards a Master’s degree at other universities around the world.
Topics covered:
- Master the foundations of data science, statistics, and machine learning
- Analyze big data and make data-driven predictions through probabilistic modeling and statistical inference; identify and deploy appropriate modeling and methodologies in order to extract meaningful information for decision making
- Develop and build machine learning algorithms to extract meaningful information from seemingly unstructured data; learn popular unsupervised learning methods, including clustering methodologies and supervised methods such as deep neural networks
- Finishing this MicroMasters program will prepare you for job titles such as: Data Scientist, Data Analyst, Business Intelligence Analyst, Systems Analyst, Data Engineer
Courses | Duration | Effort | Prerequisites | Discussion |
---|---|---|---|---|
MITx Statistics & Data Science | 756 hours | 14 hours/week | Core Curriculum, Python Certification, College level calculus | chat |
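
As a small taste of the statistical inference mentioned in the topics above, the sketch below estimates a confidence interval for a mean using only Python's standard library. It is an illustration under stated assumptions, not material taken from the program: the helper `mean_confidence_interval` and the `daily_orders` sample are hypothetical, and the interval uses a normal approximation, whereas a t-distribution is generally preferred for a sample this small.

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return (low, high) for the population mean via a normal approximation."""
    n = len(sample)
    center = mean(sample)
    # Standard error of the mean, based on the sample standard deviation.
    standard_error = stdev(sample) / n ** 0.5
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * standard_error, center + z * standard_error


# Made-up observations, for illustration only.
daily_orders = [102, 98, 110, 95, 105, 101, 99, 107, 103, 100]
low, high = mean_confidence_interval(daily_orders)
print(f"mean={mean(daily_orders):.1f}, 95% CI=({low:.1f}, {high:.1f})")
```

The interval is centered on the sample mean and narrows as the sample grows, because the standard error shrinks with the square root of the sample size.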
Gain expertise in the growing field of Supply Chain Management through an innovative online program consisting of five courses and a final capstone exam. The MicroMasters Program in Supply Chain from MITx is an advanced, professional, graduate-level foundation in Supply Chain Management.
Topics covered:
- Apply core methodologies (probability, statistics, optimization) used in supply chain modeling and analysis.
- Understand and use fundamental models to make trade-offs between forecasting, inventory, and transportation.
- Design supply chain networks as well as financial and information flows.
- Understand how supply chains act as systems and interact.
- Learn how technology is used within supply chains, from fundamentals to packaged software systems.
- Practice end-to-end supply chain management.
Courses | Duration | Effort | Prerequisites | Discussion |
---|---|---|---|---|
MITx Supply Chain Management | 936 hours | 14 hours/week | Core Curriculum, Python Certification, College level calculus | chat |
Once you have developed these skills, you will need a way to gain experience. Here are three good ways to gain the experience that will help you get hired.
- Stack Overflow. This website helps software engineers find answers to bugs and common problems. Building up a high reputation on Stack Overflow will help you get your foot in the door.
- HackerRank. HackerRank lets developers solve challenges to earn points. Recruiters can then request interviews with top-ranked coders. This is an awesome resource!
- Contribute to at least one popular open source project for six months. Some popular projects are below:
Chat is here. To see the latest trending projects click here.