CMDA 3654
Information
Topic: Intro to Data Analytics & Visualization
Lecture: In person (MWF 1:25-2:15PM) at DER 3083
Instructor: Xin (Shayne) Xing, Email: xinxing AT vt.edu
Office hours: 2:30 PM - 5:00 PM Wednesday in Room 402 Hutcheson Hall
Zoom link (note: in-person visitors have priority, so Zoom attendees may have a longer wait)
TA: Xueying Liu, Email: xliu96 AT vt.edu
TA office hours: 3:00 PM - 4:00 PM Wednesday (Zoom link)
Syllabus
Course description & Prerequisites
Basic principles of data analytics; supervised and unsupervised statistical methods; basic deep learning methods for supervised learning; data visualization of standard-size and large datasets; programming language: R.
This course is a required course for the Computational Modeling and Data Analytics Degree. The course sequence is listed at the 3000 level so that the students will have been previously exposed to introductory mathematics (linear algebra, multivariate calculus), a programming language, introductory statistics (basic mathematical statistics), and probability. Students without strong preparation in these will need to invest significant additional time to fill in the gaps.
All class materials are distributed online; for example, you may view most class notes and homework assignments on the Schedule. Canvas is used to report scores from quizzes, homework and the final project.
Recommended Textbook
An Introduction to Statistical Learning with Applications in R
Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani
Advanced topics:
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Trevor Hastie, Robert Tibshirani, Jerome Friedman
Dive into Deep Learning
Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola
In-class Quiz
Each in-class quiz is worth 10 points and must be submitted on Canvas before the end of class. A quiz submitted on time is guaranteed at least 5 points; a quiz not submitted on time receives a zero.
Homework Assignments
Weekly homework assignments will be posted on both the Schedule and Canvas. Homework submitted within 24 hours after the deadline is penalized to 90% of its total score. Homework more than 24 hours late will not be accepted, and missed homework receives a zero. Homework must be submitted on Canvas, and grades will be returned to you there.
Students are expected to read the slides and referenced materials listed in the Schedule. Your work must be legible, include your name, and be submitted as a single PDF file. You are expected to put in 6-8 hours of work outside of class. A few of you will do well with less time than this, and a few of you will need more. You must write up your own final answers and write your own code; copying homework solutions is not allowed.
Final Project
There will be one final project. The final report will be a well-written PDF document that includes an introduction, data visualization, model and methods, and results. You must write the final report and code yourself.
Final Exam
The final exam will be held at the following date and time.
Exam Date: December 09, 2022
Begin Time: 1:05PM
End Time: 3:05PM
Grades
Your grade will consist of in-class quizzes (5%), homework (50%), a final project (20%), and a final exam (25%).
Category | Weight |
---|---|
Quiz | 5% |
Homework | 50% |
Final Project | 20% |
Exam | 25% |
The total score is the weighted average of the scores in all categories. A total score of at least 90 guarantees at least an A-; at least 80, a B-; at least 70, a C-; and at least 60, a D-. These cutoffs may be lowered depending on overall class performance.
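As an illustration of the weighted average, here is a minimal R sketch. The weights come from the Grades table; the category scores are made-up numbers on an assumed 0-100 scale, and the variable names are hypothetical. It is not an official grade calculator.

```r
# Category weights from the Grades table (they sum to 1).
weights <- c(quiz = 0.05, homework = 0.50, project = 0.20, exam = 0.25)

# Hypothetical category scores on an assumed 0-100 scale (made-up numbers).
scores <- c(quiz = 90, homework = 85, project = 80, exam = 75)

# Weighted average: multiply each score by its weight and add the results.
# Indexing by names(weights) keeps scores and weights in the same order.
total <- sum(weights * scores[names(weights)])
print(total)
```

With these example numbers the total is 81.75, which is above 80 and so would be guaranteed at least a B-.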
Academic Integrity
The Undergraduate Honor Code pledge that each member of the university community agrees to abide by states:
“As a Hokie, I will conduct myself with honor and integrity at all times. I will not lie, cheat, or steal, nor will I accept the actions of those who do.”
Students enrolled in this course are responsible for abiding by the Honor Code. A student who has doubts about how the Honor Code applies to any assignment is responsible for obtaining specific guidance from the course instructor before submitting the assignment for evaluation. Ignorance of the rules does not exclude any member of the University community from the requirements and expectations of the Honor Code. Academic integrity expectations are the same for online classes as they are for in person classes. All university policies and procedures apply in any Virginia Tech academic environment. For additional information about the Honor Code, please visit: https://www.honorsystem.vt.edu/
Honor Code Pledge for Assignments: The Virginia Tech honor code pledge for assignments is as follows:
“I have neither given nor received unauthorized assistance on this assignment.”
The pledge is to be written out on all graded assignments at the university and signed by the student. The honor pledge represents an expression of the student’s support of the Honor Code.
The field of Computational Modeling and Data Analytics requires professionals who act with the highest ethical standards. CMDA teaches skills that empower you to have a tremendous impact upon the world. We teach you these skills with the expectation that you will exercise them responsibly.
Responsible practice is a habit forged during your undergraduate studies. CMDA majors demonstrate their sound ethical foundation by completely adhering to the Virginia Tech Honor Code in all their courses. Please read the detailed policy at https://personal.math.vt.edu/embree/cmda_integrity.pdf
Schedule
Time | Materials | Homework |
---|---|---|
Week 01(08/22-08/26) | Week 1: Intro to data science and R. Readings: Artificial Intelligence-The Revolution Hasn't Happened Yet; Install R, RStudio and knit to PDF on your laptop (Mac/Windows); An Introduction to R; R Markdown (cheat sheet) | Homework 01 (due on 11:59PM, 09/02.) |
Week 02(08/29-09/02) | Week 2: Intro to R Programming (code). Readings: Find basic R functions in the R basic cheat sheet | Homework 02 (due on 11:59PM, 09/09.) |
Week 03(09/05-09/09) | Week 3: Intro to R Programming (code) | Homework 03 (due on 11:59PM, 09/16.) |
Week 04(09/12-09/16) | Week 4: R Graphics (code) ggplot2 | Homework 04 (due on 11:59PM, 09/23.) |
Week 05(09/19-09/23) | Week 5: Data Input and Cleaning (code) tidyr | Homework 05 (due on 11:59PM, 09/30.) |
Week 06(09/26-09/30) | Week 6: Simple Linear Regression (code) Readings: Orthogonal Projection | Homework 06 (due on 11:59PM, 10/07.) |
Week 07(10/03-10/07) | Week 7: Multiple Linear Regression | Homework 07 (due on 11:59PM, 10/14.) |
Week 08(10/10-10/14) | Week 8: Logistic Regression (code) | Homework 08 (due on 11:59PM, 10/21.) |
Week 09(10/17-10/21) | Week 9: Shrinkage Regression, LDA and QDA (code) | Homework 09 (due on 11:59PM, 10/28.) |
Week 10(10/24-10/28) | Week 10: Dimension Reduction (code) | Homework 10 (due on 11:59PM, 11/04.) |
Week 11(10/31-11/04) | Week 11: K-means Clustering (code) | Homework 11 (due on 11:59PM, 11/12.) |
Week 12(11/07-11/11) | Week 12: Hierarchical Clustering (code) | |
Week 13(11/14-11/18) | Week 13: Fully Connected Neural Network (code) | |
11/21-11/25 | Thanksgiving Week | |
Week 14(11/28-12/02) |