Final project

### Final Project Instructions

Module 6: Collaborative Group Project Instructions

As the final assignment in this course, I want you all to be more creative, take some risks, and share a project. Please use at least one of the methods we have learned in the first five modules (i.e., Regression Analysis, Hypothesis Test, Regularization, Data, Classification/Data Mining, and Time Series Analysis) to analyze a real-world data set of interest to you, with the goal of solving a problem you pose.

Come up with research questions based on the data you choose, and apply the analytical methods covered in Modules 1-5 of this course. Prepare a written report and a presentation based on your findings. The written report will typically include the following:

• a cover page with the names of all students in your group, the course/section, the institution, the instructor’s name, the assignment title, and the date
• a section on the research questions
• a section on the data set(s) chosen
• a section on the method(s) chosen
• a section on results & findings
• a section on conclusions
• references (if any)
• appendix: your R script, or your analysis document in any other format (Excel, Tableau, Python, etc.)

The presentation should include similar information (research question, description of the data, summary of results, conclusions, and limitations), but be briefer. It should last between 5 and 10 minutes. In your presentation, do not focus on the math or statistical details; instead, highlight the data storytelling and visualization. Your presentation order is shown in the section below.

You will work in groups of two or three students. Your project should follow APA format. I am not imposing a page limit, but I expect a thorough report. Please approach this as your chance to investigate something of interest to you. Include images, graphs, figures, charts, and tables as necessary.

If you have any questions, please contact me. However, do not wait until the day the assignment is due to email me with questions or concerns. Start working on the project as soon as you can to give yourself the best chance of success.

The presentations will be given during our normal class time. The written report is due on Wednesday, June 26, at 11:59 pm. Please list the contributions of all group members in both your presentation and your final report.

I look forward to seeing your creative and analytical work. I know that if you put in the effort, you can produce some fascinating results!

Some Data Sources:

1) Regression

i) Red Wine Quality

https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009

2) Classification

i) Titanic dataset

https://www.kaggle.com/c/titanic/data

3) Clustering

i) Credit card data

https://www.kaggle.com/arjunbhasin2013/ccdata

4) Time series analysis

i) Bitcoin price forecasting

https://www.kaggle.com/ara0303/forecasting-of-bitcoin-prices

