edownin1 / Coursera Capstone Project W3 - Exploratory Data Analysis.ipynb

This course is part of the IBM Data Engineering Professional Certificate

Data Engineering Capstone Project

Instructor: Rav Ahuja

Financial aid available

IBM

About this Course

In this course you will apply a variety of data engineering skills and techniques you have learned as part of the previous courses in the IBM Data Engineering Professional Certificate. You will assume the role of a Junior Data Engineer who has recently joined the organization and be presented with a real-world use case that requires a data engineering solution.

Computer and IT literacy.

What you will learn

Demonstrate proficiency in skills required for an entry-level data engineering role.

Design and implement various concepts and components in the data engineering lifecycle such as data repositories.

Showcase working knowledge with relational databases, NoSQL data stores, big data engines, data warehouses, and data pipelines.

Apply skills in Linux shell scripting, SQL, and Python programming languages to Data Engineering problems.

Instructors

Ramesh Sannareddy

IBM is the global leader in business transformation through an open hybrid cloud platform and AI, serving clients in more than 170 countries around the world. Today 47 of the Fortune 50 Companies rely on the IBM Cloud to run their business, and IBM Watson enterprise AI is hard at work in more than 30,000 engagements. IBM is also one of the world's most vital corporate research organizations, with 28 consecutive years of patent leadership. Above all, guided by principles for trust and transparency and support for a more inclusive society, IBM is committed to being a responsible technology innovator and a force for good in the world.

For more information about IBM visit: www.ibm.com

Data Science with R - Capstone Project

IBM

About this Course

In this capstone course, you will apply various data science skills and techniques that you have learned as part of the previous courses in the IBM Data Science with R Specialization or IBM Data Analytics with Excel and R Professional Certificate.

For this project, you will assume the role of a Data Scientist who has recently joined an organization and be presented with a challenge that requires data collection, analysis, basic hypothesis testing, visualization, and modeling to be performed on real-world datasets. You will collect and understand data from multiple sources, conduct data wrangling and preparation with Tidyverse, perform exploratory data analysis with SQL, Tidyverse and ggplot2, model data with linear regression, create charts and plots to visualize the data, and build an interactive dashboard. The project will culminate with a presentation of your data analysis report, with an executive summary for the various stakeholders in the organization.

What you will learn

Write a web scraping program to extract data from an HTML file using HTTP requests and convert the data to a data frame.

Prepare data for modelling by handling missing values, formatting and normalizing data, binning, and turning categorical values into numeric values.

Interpret data with exploratory data analysis techniques by calculating descriptive statistics, graphing data, and generating correlation statistics.

Build a Shiny app containing a Leaflet map and an interactive dashboard, then create a presentation on the project to share with your peers.
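The data preparation steps named in the list above (handling missing values, normalizing, binning, and turning categorical values into numeric values) can be sketched in a few lines. The course itself works in R with Tidyverse; this is only a minimal Python illustration using the standard library, and the records, column names, and helper functions are hypothetical, chosen for the example rather than taken from the course.

```python
from statistics import mean

# Hypothetical records: one numeric column with a gap, one categorical column.
rows = [
    {"temperature": 10.0, "season": "winter"},
    {"temperature": None, "season": "summer"},
    {"temperature": 30.0, "season": "summer"},
    {"temperature": 20.0, "season": "spring"},
]


def impute_mean(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max(values):
    """Rescale values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bin_equal_width(values, labels):
    """Assign each value to one of len(labels) equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels) or 1.0  # avoid a zero width
    binned = []
    for v in values:
        # The maximum value would index one past the end, so clamp it.
        idx = min(int((v - lo) / width), len(labels) - 1)
        binned.append(labels[idx])
    return binned


def one_hot(categories):
    """Turn each category into a dict of 0/1 indicator columns."""
    levels = sorted(set(categories))
    return [{f"is_{lvl}": int(c == lvl) for lvl in levels} for c in categories]


temps = impute_mean([r["temperature"] for r in rows])
scaled = min_max(temps)
bins = bin_equal_width(temps, ["low", "medium", "high"])
dummies = one_hot([r["season"] for r in rows])
```

Mean imputation and equal-width bins are just one possible choice for each step; median imputation or quantile bins may suit skewed data better.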

Instructors

Jeff Grossman

Syllabus - What you will learn from this course

Module 1: Capstone Overview and Data Collection

Module 2: Data Wrangling

Module 3: Performing Exploratory Data Analysis with SQL, Tidyverse & ggplot2

At this stage of the Capstone Project, you have gained some valuable working knowledge of data collection and data wrangling. You have also learned a lot about SQL querying and visualization. Congratulations! Now it's time to apply some of your new knowledge and learn about Exploratory Data Analysis (EDA) techniques, again through practice. You can use the datasets you wrangled in the previous Module. However, if you had any issues completing the wrangling, no worries - we have prepared some clean datasets for you to use. You will be asked to complete three labs:

Module 4: Predictive Analysis

Module 5: Building an R Shiny Dashboard App

Module 6: Present Your Data-Driven Insights

Frequently Asked Questions

When will I have access to the lectures and assignments?

Access to lectures and assignments depends on your type of enrollment. If you take a course in audit mode, you will be able to see most course materials for free. To access graded assignments and to earn a Certificate, you will need to purchase the Certificate experience, during or after your audit. If you don't see the audit option:

The course may not offer an audit option. You can try a Free Trial instead, or apply for Financial Aid.

The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

What will I get if I subscribe to this Certificate?

When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.

Learner Reviews & Feedback for IBM Data Analyst Capstone Project by IBM

Top reviews

Nov 8, 2021

This is a great course. I learned so much about data science. I appreciate all the help I received. I would not have finished the course without the help of my fellow students and the staff. Thank you

Sep 27, 2021

Great. I practiced data visualization on IBM's Cognos Analytics software. It's a great piece of software. I then learned how to make a presentation to present the results of the analysis.

1 - 25 of 128 Reviews for IBM Data Analyst Capstone Project

By Marcos L

Jun 1, 2021

The instructors are pretty boring, however, even though the course has good content, the resources are buggy, there are too many mistakes in the assessments and not even IBM recognizes the course.

You might as well learn via youtube

By Raghvendra K

Dec 28, 2020

I learned so much from this course about the data analysis process. I gained skills such as the Python language and its libraries, SQL, advanced Excel, and data visualization tools. I got a junior-level data analyst job.

By Liliya T

May 17, 2021

The course is good, thank you. But I have some recommendations for the final assignment (presentation). The task is not clear; it is very difficult to understand what exactly we should do. Do we need to use all the data from Stack Overflow's annual Developer Survey or only the technology trends? And in the submission section there are questions, and it's not clear whether we need to put all the information from the slides there or just write "Yes" or "No". I think the authors should write better instructions for the assignment. And you should mention that grades depend on the number of findings/conclusions/results in the presentation.

By Donald S

Jun 10, 2021

IBM software bugs out way too often and renders some of the course material impossible to complete until their system repairs itself.

By Keynon K

Mar 25, 2021

This course was filled with things we did not previously cover in other modules, and then had very little or no correlation to the final capstone. The final capstone project assignment itself was great, but the labs and quizzes leading up to it felt like a large waste of time.

Apr 7, 2021

The skills and tools taught are interesting and useful, and it's rewarding to build your own project within the guiding structure of the course. That said, the curriculum was often vague, requiring considerable time just to divine what was being asked for. Support from the teaching staff in the forums seemed inconsistent, with many questions receiving unhelpful responses and some simply ignored. Of course, struggling with the material pushes you to develop a deeper understanding of it, and the open-endedness is good practice for real-world problems. Ultimately, I was able to figure it out, but I think a few corrections and clarifications would improve the curriculum.

By Chukwunonso K

Mar 8, 2021

This course was exceptional, and I was able to complete it and obtain my professional IBM Data Analyst Certificate.

Kudos to all the tutors who helped me understand more and more concepts around data.

By Sherita K

Jan 13, 2023

Overall, I found the course to be great. The Capstone project itself was a little frustrating. We were expected to know subjects we hadn't covered yet, the labs were riddled with errors, and some of the source material is outdated. Please take a look into this.

Dec 12, 2022

I give 1 star to this course as it needs improvements. The project is rather chaotic and there are many errors in it. It would be good to work with the same dataset/file throughout the entire course. A big drawback is that the free plan in IBM Cloud is limited. In my case, it came to an end at the very beginning of this project. Thus I had to find and use another tool for Jupyter Notebooks and Python. In the end I didn't manage to do all the tasks with CDE, as I simply didn't have it even though I created a new account to use IBM Watson Studio.

By Victor N

Nov 25, 2022

Very poor course, the software is not working like Cognos, always mistakes

By MAURICIO C

Dec 12, 2020

This was not easy learning. The student must work hard and develop a strong understanding of the data's contents and its normalization, determine the value objectives, apply filters, and do everything necessary to present outstanding results in the final presentation. I feel that in this training I have achieved a great goal with this learning.

By Muhammad S

May 24, 2021

Overall this bundle of courses is one of the best to start learning data analysis, it's well structured and designed. A lot of self-learning too and not everything is given to you on a platter which is good. All components of this course together can give you sufficient skills to start an entry-level role or your own project

Feb 12, 2021

For me, this capstone project, together with the previous courses, taught me the whole process of data analysis from beginning to end. Now I know what I need to keep working on and improving.

By Robert S

Apr 26, 2022

First five weeks are great, a good hands-on exercise on everything except the process of researching (since we're handed all necessary data in links throughout the course), which is often outside our purview anyway. One notable exception is the process of saving CSV and XLSX files in the week 1 labs on APIs and webscraping (the Github job and salary data we need to use in the final Powerpoint). A tutorial lab or section DESPERATELY needs to be added teaching us how to access these files after saving them -- the process involves setting up project access tokens and a cloud storage object, then importing a special IBM cloud library allowing one to save the files as accessible data assets, but this information is difficult to find online. It should really be laid out in a lab, both for the final project and to let us test the files we made at the time we make them.

Week 6, the Powerpoint, isn't great. There's been a lot of technical education on the processes of data cleaning and everything of that nature across these nine courses, but very little in terms of what conclusions one might draw from this data. I think the certificate could use a course on this sort of actual analysis featuring examples of some of the conclusions we might draw from other datasets, or even a couple freebies at the start of this project just to point people in the right direction. When it came to coming up with innovative ideas about the data, I found myself coming up with a bunch and wondering if they were "innovative" enough; like whether simply laying out the rankings on a particular bar chart was sufficient for a "finding" point, or whether future predictions about rankings were sufficient for an implication. Grade-wise I did fine, I just think a bit of practice earlier would help us not be stumbling around so much on the final presentation.

Jun 10, 2022

Disappointed in this course, could be much better if more work is done. IBM software and oversights in design were my biggest issues. Some parts are intuitive and well paced, while others are difficult for a person without programming experience. I expected better from a course with IBM's name on it.

In this last module, I completed some Jupyter exercises and when I took the test for the section I had to return to the Jupyter lab project and add in more code to answer additional questions. Just have me do those exercises in the first place, then I can answer all the questions and it won't waste my time going back and forth. The IBM software ran out of processing time on the trial when I was busy with the Python exercises, then you have to wait a full month for it to reset so I ended up having to use Jupyter running on my computer to finish before the module was due.

I learned a lot of useful skills and the content was great, but I think there are things that could be done to make it a smoother and more enjoyable experience.

Dec 22, 2021

I love completing visualisations of data. It's what attracted me to data science as a career swap. However, this course (it means well) is broken in a lot of places, which resulted in me spending as much time on the forums in the Capstone as I did on the course material. I would take 4 weeks of interpreting visualizations before another "course unavailable" or "Watson environment unresponsive" challenge. I appreciate everyone's efforts and input in to this course, thank you. I would not recommend it unless you have extra waiting time and a solid background in Python, Cognos and APIs.

Nov 9, 2021

Nov 19, 2021

This specialization was at the top of all my courses. So many different choices for trying to learn data analysis: SQL, Python, Excel, etc. Everything thoroughly explained. Just thank you again.

By Muktar H

Jul 13, 2021

I love its video and reading materials. Thank you all for making such a course; I can definitely make a career out of it. I just recommend that you all give your time to this course.

By Hichem D

Dec 27, 2020

This course was a very good opportunity to practice almost all the materials studied in this specialization.

Taking the role of an associate data analyst was very helpful.

By Ricardo S

Jan 5, 2021

Excellent course; each lesson has a good pace and a good ratio of theory and practice.

By Oluwabukunmi F

Dec 13, 2022

Totally worth it and very detailed

Dec 17, 2022

The first 5/6 of the course built on, and tested knowledge gained over the previous 8 courses.

Week 6 was like hearing "And now for something completely different...!"

There needs to be a more gradual, and complete "learning how to write a report" section.

By Anthony G

Aug 28, 2022

This was informative, but there are many lab issues that are over a year old, which many participants have naturally come across without a solution and had to make do with subpar submissions, or lost plenty of sleep, time, and health to complete. Please incorporate a list of tips at the end of each lab that handle the typical pitfalls, so the forums can be left for really unconventional problems. The "Collecting Job data from API" lab does not allow importing of a certain library for creating an Excel worksheet.

Sep 25, 2022

Complex course to understand at the beginning because some links do not work so you should go to the forums to get guidance on what to do. However, once the previous problems have been solved, the course leaves teaching on practical aspects. I do not recommend the course for a beginner as an advanced level of understanding of the concepts is required to be able to apply them fluently. 6/10.

COMMENTS

  1. ibm-data-analyst-professional · GitHub Topics · GitHub

    GitHub is where people build software. More than 100 million people use GitHub to discover, fork, and contribute to over 330 million projects. ... b06601024 / Coursera-IBM-Data-Analyst Star 15. Code Issues Pull requests ... Capstone projects of the IBM Data Analyst Professional . python data-science sql excel pandas data-visualization data ...

  2. GitHub

    IBM Data Analyst Professional. About this Professional Certificate. Gain the job-ready skills for an entry-level data analyst role through this eight-course Professional Certificate from IBM and position yourself competitively in the thriving job market for data analysts, which will see a 20% growth until 2028 (U.S. Bureau of Labor Statistics). ...

  3. IBM Data Analyst Capstone Project

    You will perform the various tasks that professional data analysts do as part of their jobs, including: - Data collection from multiple sources - Data wrangling and data preparation - Exploratory data analysis - Statistical analysis and data mining - Data visualization with different charts and plots, and - Interactive dashboard creation.

  4. IBM Data Analyst Capstone Project

    1700 Coursera Courses That Are Still Completely Free. By completing this final capstone project you will apply various Data Analytics skills and techniques that you have learned as part of the previous courses in the IBM Data Analyst Professional Certificate. You will assume the role of an Associate Data Analyst who has recently joined the ...

  5. PDF Coursera

    the coding in Python with the help of Coursera-IBM Skill lab and GitHub working out a proposal. 4/17/2020 IBM Data Science Capstone Project 2. ... 4/17/2020 IBM Data Science Capstone Project 6. Methodology : Learning IBM Model-CRISP CRISP (Cross Industries Standard Process) through Coursera : ... cities as an example with url1 & url2 , but 2.2 ...

  6. Coursera Capstone Project W1L2

    GitHub Gist: instantly share code, notes, and snippets. ... Instantly share code, notes, and snippets. edownin1 / Coursera Capstone Project W1L2 - Collecting Data Using APIs.ipynb. Last active November 27, 2022 23:24. Star 1 Fork 1 Star Code ... Coursera Capstone Project W1L2 - Collecting Data Using APIs.ipynb ...
