Data Science Capstone Final Project
This presentation briefly describes a project that predicts the next word of a sentence fragment or phrase.
The application is a capstone project for the Coursera Data Science Specialization, provided by Johns Hopkins University with support from SwiftKey.
The main goal is to develop a predictive algorithm. The front end is a Shiny application and the back end is written in R.
The application was developed using a sample of English-language Twitter tweets. This sample was provided by SwiftKey.
The data set comes in German, English, Finnish, and Russian versions. This application uses only the English version.
After loading all of the English data, the algorithm counted the number of lines, removed profanity, and tokenized the text. The tokens were then organized into n-gram sequences.
The result is a set of bigram, trigram, and quadrigram models, each converted into a frequency dictionary sorted by frequency.
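The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the project's R code; the tokenizer regex, the placeholder profanity list, and the function names are all assumptions made for the example.

```python
import re
from collections import Counter

# Placeholder profanity list (assumption): a real project would load a full word list.
PROFANITY = {"darn", "heck"}

def tokenize(line):
    """Lowercase a line and split it into word tokens, dropping profanity."""
    words = re.findall(r"[a-z']+", line.lower())
    return [w for w in words if w not in PROFANITY]

def ngram_counts(lines, n):
    """Count every n-gram (as a tuple of words) across all lines."""
    counts = Counter()
    for line in lines:
        tokens = tokenize(line)
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def frequency_table(counts):
    """Return (ngram, count) pairs sorted by descending frequency."""
    return counts.most_common()

# Made-up sample lines standing in for the tweet data.
sample = [
    "I love data science",
    "I love data",
    "darn I love R",
]
bigrams = ngram_counts(sample, 2)
trigrams = ngram_counts(sample, 3)
quadrigrams = ngram_counts(sample, 4)
print(frequency_table(bigrams)[0])
```

Storing each n-gram as a tuple keeps the lookup key unambiguous, and sorting by count once up front means the most likely continuation can be read from the top of the table.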
Application
The application was designed for functionality and simplicity. By default, when the application loads, it checks for input and displays a message asking the user to enter a word or phrase.
The user can then enter a word or phrase and must click Submit. When this happens, 3 items will display.
- The next word
- The R application's log of what has transpired. This is for debugging.
The application starts with the quadrigram model and works its way down to lower-order models until it finds a predicted word.
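The back-off from quadrigram to lower-order models can be sketched as below. This is an illustrative Python version under stated assumptions, not the app's R implementation: the `predict_next` function, the layout of the `models` dictionary, and the toy counts are all hypothetical.

```python
def predict_next(phrase, models):
    """Back off from the highest-order model to the lowest until a match is found.

    `models` maps n-gram order (4, 3, 2) to a dict of
    {context tuple: {next word: count}}.
    Returns (predicted word, order of the model that matched).
    """
    words = phrase.lower().split()
    for order in sorted(models, reverse=True):
        context_len = order - 1
        if len(words) < context_len:
            continue  # phrase too short for this model
        context = tuple(words[-context_len:])
        candidates = models[order].get(context)
        if candidates:
            # Pick the most frequent continuation seen after this context.
            return max(candidates, key=candidates.get), order
    return None, None

# Toy frequency dictionaries (made-up counts, for illustration only).
models = {
    4: {("thanks", "for", "the"): {"follow": 5, "help": 2}},
    3: {("for", "the"): {"first": 7, "follow": 5}},
    2: {("the",): {"same": 9, "first": 7}},
}

print(predict_next("thanks for the", models))   # matched by the quadrigram model
print(predict_next("waiting for the", models))  # backs off to the trigram model
print(predict_next("over the", models))         # backs off to the bigram model
```

Returning the matching order alongside the word gives the kind of trace that a debugging display could show.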
The prediction app is hosted on the shinyapps.io location: https://zagnut.shinyapps.io/shiny/
The code for this frontend application is hosted here: https://github.com/motticus/capstone
The data for this application is hosted here: https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip
- Coursera Data Science Capstone Final Project
- by Manuel A. Diaz R.
Data Science Capstone Projects #18
Ekaterina Butyugina

Cortexia: Sustainable Clean City - Darkzones Analytics
Students: Dominik Bacher , Valeriia Rutskaia

Talmis: Macroeconomic forecasting using machine learning methods
Students: Hussam Al-Homsi , Patrizia Will
- First, they applied time-series clustering to group the 196 countries into clusters with similar historical trends or shapes for each macroeconomic variable (MEV).
- Then, they performed statistical filtering using the Granger causality test, selecting the countries with higher predictive power for the target country per MEV (using p < 0.05).
- Finally, by combining Facebook’s additive model “Prophet” with the multivariate vector autoregressive (VAR) model, they were able to predict the MEVs stepwise, year by year.
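The idea behind the Granger filtering step can be illustrated with a small sketch. It is not the students' code: it uses a single lag, synthetic series, and hypothetical helper names, and it stops at the F statistic. Turning that statistic into the p-value mentioned above would additionally need the F distribution, typically taken from a statistics library.

```python
import random

def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(rows, target):
    """Residual sum of squares of an ordinary least-squares fit."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(bi * ri for bi, ri in zip(beta, r))) ** 2
               for r, t in zip(rows, target))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' using a single lag."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]
    unrestricted = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    dof = len(target) - 3  # observations minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / dof)

random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
# Synthetic series: y follows x with a one-step delay plus small noise.
y = [0.0] + [0.8 * x[t - 1] + random.gauss(0, 0.1) for t in range(1, 200)]

print(granger_f(x, y))  # large: lagged x helps predict y
print(granger_f(y, x))  # small: lagged y does not help predict x
```

The test compares a model that predicts a series from its own past with one that also sees the other series' past; a large drop in residual error is what marks a country as having predictive power for the target.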

- Additional algorithms should be tested to expand and deepen the understanding of the resilience of the banks.
- The global MEV data set should be enhanced and include quarterly data to allow for higher precision forecasting.
- The approach does not account for trade relations between the countries. For instance, countries with stronger ties in global trade should receive more weight in the model than countries with lower mutual trade volume. This factor should be included as the next step in future models.
CancerDataNet: Time predictions for follow-up treatment in cancer patients
Students: Muchun Zhong , Jacques Stimolo , Ernest Mihelj
- First, they studied the medical study documentation and the data itself to understand the data better and to find anomalies in it.
- The second step was cleaning the data: they removed the anomalous data and cleaned the rest based on the missing rate.
- The final step was to take the final version of the dataset and create synthetic replacements for the missing values (imputation). Muchun, Jacques and Ernest implemented different imputation strategies and compared the resulting performance and accuracy of the prognostic models.
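As a rough illustration of the missing-rate check and the imputation step, the sketch below compares two simple strategies. The source does not say which strategies the students used, so mean and median imputation are assumptions here, as are the function names and the made-up column of values.

```python
import statistics

def missing_rate(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)

def impute(values, strategy):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = statistics.mean(observed)
    else:
        fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

# Hypothetical column: days until follow-up treatment, with gaps and one outlier.
days = [30, 28, None, 35, 400, None, 31, 29]

print(missing_rate(days))
print(impute(days, "mean"))
print(impute(days, "median"))
```

With an outlier in the column, the mean fill is pulled far above the typical value while the median fill is not, which is one reason to compare strategies against downstream model accuracy.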

360° Stock Prediction: Predicting the highest return stocks globally via robust KPIs and perceived company confidence
Students: Karim Khalil , Fernando Beato , Lukas Doboczky, Rafael Zack


JHU-Data-Science-Capstone
Coursera Data Science Specialization
JHU Data Science Capstone Project
The completed project.
A Shiny App for predicting the next word in a string. The App
The Project
Project Overview Syllabus

Project Tasks - Instructions
- Task 0: Understanding the Problem
- Task 1: Getting and Cleaning the Data
- Task 2: Exploratory Data Analysis
- Task 3: Modeling
- Task 3A: Milestone Report
- Task 4: Prediction Model
- Task 5: Creative Exploration
- Task 6: Data Product
- Task 7: Slide Deck
- Task 8: Final Project
Project Scripts - Solutions
- Task 0: Exploring the tm Package
- Task 1: Getting and Cleaning the Data
- Task 2: Exploratory Data Analysis
- Task 3A: Milestone Report
- Task 4: Working toward a Prediction Model
- Task 04A: Fast Ngram Files
- Task 05: Prediction Model
- Task 06: Shiny App
- Task 06A: Shiny App Source Code
- Task 07: Slide Presentation
Course Quizzes
- Quiz 1
- Quiz 2
- Quiz 3
Tidy Data Text Mining with R: A Tidy Approach
Capstone Projects
Education is one of the pillars of the Data Science Institute.
Through educational activities, we strive to create a community in Data Science at Columbia. The capstone project is one of the most lauded elements of our MS in Data Science program. As a final step during their study at Columbia, our MS students work on a project sponsored by a DSI industry affiliate or a faculty member over the course of a semester.
Faculty-Sponsored Capstone Projects
A DSI faculty member proposes a research project and advises a team of students working on this project. This is a great way to run a research project with enthusiastic students, eager to try out their newly acquired data science skills in a research setting. This is especially a good opportunity for developing and accelerating interdisciplinary collaboration.
2022-2023 Academic Year
- Fall 2022: July 15, 2022
- Spring 2023: TBA
Project Archive
- Spring 2022
- Spring 2020
- Spring 2019
- Spring 2018
- Spring 2016
Professional and Lifelong Learning
In-person, blended, and online courses
Data Science: Capstone

- Professional Certificate in Data Science
- Data Analysis
- Data Visualization
- Probability
Associated Schools

Harvard T.H. Chan School of Public Health
What you'll learn
- How to apply the knowledge base and skills learned throughout the series to a real-world problem
- Independently work on a data analysis project

Course description
To become an expert data scientist you need practice and experience. By completing this capstone project you will get an opportunity to apply the knowledge and skills in R data analysis that you have gained throughout the series. This final project will test your skills in data visualization, probability, inference and modeling, data wrangling, data organization, regression, and machine learning.
Unlike the rest of our Professional Certificate Program in Data Science, in this course, you will receive much less guidance from the instructors. When you complete the project you will have a data product to show off to potential employers or educational programs, a strong indicator of your expertise in the field of data science.

Rafael Irizarry

Final Project for IBM Data Science Certificate. Contribute to Roderic19/IBM-Applied-Data-Science-Capstone development by creating an account on GitHub.
title: "Coursera Data Science Capstone - Final Project Submission". author: "®γσ, Eng Lian Hu". date: "4/28/2016". output: revealjs::revealjs_presentation:.
Data Science Capstone Final Project. This project uses the author's knowledge of data science and basic knowledge of NLP in R to build an app that can predict
Data Science Capstone Final Project. David Mott. This presentation is a short description of a project that will predict the next word of a sentence
This is the final project of the Coursera Data Science Capstone. In this project, a Word Predictor application was created using words or
Introduction. This presentation is created to present the final assignment for the Data Sciences Capstone Course, from Coursera course.
A full description of the Data Science Student's final projects.