De Anza College Course Outlines

Credit - Degree applicable
Effective Quarter: Fall 2020

I. Catalog Information

CIS 9
Introduction to Data Science
4.5 Unit(s)

 

Requisites: Prerequisite: CIS 41A.

Hours: Lec Hrs: 48.00
Lab Hrs: 18.00
Out of Class Hrs: 96.00
Total Student Learning Hrs: 162.00

Description: This course is an introduction to data science, covering data analytics and machine learning. Topics include data gathering and data wrangling, data assessment and visualization, supervised and unsupervised machine learning, and natural language processing.


Student Learning Outcome Statements (SLO)

 

• Student Learning Outcome: Collect, clean, analyze, and visualize data to meet and defend a measured objective.


 

• Student Learning Outcome: Gather data and choose a model to train and tune the machine learning tool and interpret the result.


II. Course Objectives

A.Define and describe data science concepts
B.Apply mathematics and statistics building blocks
C.Apply programming constructs in data science
D.Collect data from multiple sources
E.Apply data wrangling techniques
F.Assess data sets
G.Visualize and present data
H.Apply a learning framework
I.Evaluate supervised learning
J.Evaluate unsupervised learning
K.Evaluate natural language processing

III. Essential Student Materials

 None

IV. Essential College Facilities

 Computer lab with computers running the Python interpreter and the Anaconda distribution

V. Expanded Description: Content and Form

A.Define and describe data science concepts
1.The role of data analytics
2.Machine learning application
B.Apply mathematics and statistics building blocks
1.Linear algebra - notation, vector and matrix operations
2.Statistics - measures such as mean, median, mode, standard deviation, outliers, correlation, confidence interval
C.Apply programming constructs in data science
1.Data Structures
a.Array
b.Data frame
2.Data input / output
3.Operators
4.Functions
5.Plots
D.Collect data from multiple sources
1.Define objective
2.Web scraping
3.Web API
4.Importing file
a.HTML
b.text
c.CSV
d.JSON
E.Apply data wrangling techniques
1.Joining data
2.Data validation
3.Data cleaning
F.Assess data sets
1.Exploratory data analysis
2.Data filtering, sorting
3.Searching, retrieving data
4.Time series
G.Visualize and present data
1.Univariate and multivariate plots
2.Defense of results
H.Apply a learning framework
1.Loading dataset
2.Learning and predicting
3.Saving models
I.Evaluate supervised learning
1.Training, predicting
2.Evaluating model success
J.Evaluate unsupervised learning
1.Clustering
2.Outlier detection
K.Evaluate natural language processing
1.Rule based and statistical NLP
2.Sentiment analysis
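
As an illustration of the statistical measures listed under item B.2 above, the following is a minimal sketch using only the Python standard library. It is not part of the official outline. The function names, the sample data, and the two-standard-deviation outlier rule are assumptions chosen for the example; other outlier rules (for example, ones based on the interquartile range) are equally valid.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def find_outliers(values, threshold=2.0):
    """Flag values more than `threshold` sample standard deviations from the mean.

    The threshold of 2.0 is an assumption made for this sketch.
    """
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    return [v for v in values if abs(v - mean) > threshold * stdev]


def pearson_correlation(xs, ys):
    """Compute the Pearson correlation coefficient of two equal-length lists."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical sample data, invented for illustration only.
scores = [70, 72, 72, 75, 78, 80, 140]

summary = summarize(scores)
outliers = find_outliers(scores)
```

With this sample, `summary["median"]` is 75, `summary["mode"]` is 72, and `outliers` contains only the value 140.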

VI. Assignments

A.Reading: required reading from the textbook and class notes
B.Programs: 6-8 programming homework assignments, several with 100 or more lines of code.

VII. Methods of Instruction

 Lecture and visual aids
Discussion of assigned reading
Discussion and problem solving performed in class
In-class exploration of Internet sites
Quiz and examination review performed in class
Homework and extended projects
Collaborative learning and small group exercises
Collaborative projects
Laboratory discussion sessions and quizzes that evaluate the proceedings of the weekly laboratory exercises

VIII. Methods of Evaluating Objectives

A.Evaluation of programming assignments and reports for correctness, use of design principles, documentation and efficiency.
B.One or more examinations requiring programming ability to develop an algorithm, evaluate code segments, and write code using theories presented in the course.
C.In-class lab problems, group collaborative problems, exam questions and/or online assignments or tutorials demonstrating the ability to read and analyze code through debugging and/or writing snippets of code.
D.A final examination requiring programming ability to develop algorithms, evaluate code segments, and write code using theories presented in the course.

IX. Texts and Supporting References

A.Examples of Primary Texts and References
1.Igual, Laura and Segui, Santi: Introduction to Data Science, 1st Edition, Springer, ISBN: 978-3-319-50017-1, 2017
2.Saltz, Jeffrey: An Introduction to Data Science, 1st Edition, Sage Publishing, ISBN: 978-1506377537, 2018
B.Examples of Supporting Texts and References
1.Hastie, Trevor: The Elements of Statistical Learning, 2nd Edition, Springer, ISBN: 978-0387848570, 2017

X. Lab Topics

A.Present, evaluate, and explain the role and application of data science in modern life.
B.Solve mathematical and statistical problems with large data sets.
C.Write code to perform mathematical and statistical work on input data to produce and present the result.
D.Write code to locate and read data from multiple types of data sources.
E.Write code to join, validate, and clean input data to prepare for analysis.
F.Write code to filter, search, and sort data to arrive at a conclusion about trends in the data.
G.Write code to utilize the appropriate plots to visualize data.
H.Write code to work with a machine learning model to train and predict data.
I.Write code to provide input data for a machine learning model for unsupervised learning.
J.Write code to process text and predict outcome.
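
As an illustration of lab topics E and F above (joining, validating, cleaning, filtering, and sorting data), the following is a minimal sketch using only the Python standard library. It is not part of the official outline. The CSV contents, column names, function names, and the 0 to 100 validity range are all assumptions made for the example; in the course itself, a data frame library from the Anaconda distribution might be used instead.

```python
import csv
import io

# Hypothetical input data, invented for illustration only.
STUDENTS_CSV = """id,name
1,Ana
2,Ben
3,Chi
"""

SCORES_CSV = """id,score
1,88
2,
3,abc
1,92
4,75
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_scores(rows):
    """Validate and clean: keep rows whose score is a number from 0 to 100."""
    cleaned = []
    for row in rows:
        try:
            score = float(row["score"])
        except (TypeError, ValueError):
            continue  # drop missing or non-numeric scores
        if 0 <= score <= 100:
            cleaned.append({"id": row["id"], "score": score})
    return cleaned


def join_on_id(students, scores):
    """Inner join: attach a student name to each score with a known id."""
    names = {s["id"]: s["name"] for s in students}
    return [
        {"name": names[r["id"]], "score": r["score"]}
        for r in scores
        if r["id"] in names
    ]


students = read_rows(STUDENTS_CSV)
scores = clean_scores(read_rows(SCORES_CSV))
joined = join_on_id(students, scores)

# Filter to scores of at least 90, then sort from highest to lowest.
top = sorted(
    (r for r in joined if r["score"] >= 90),
    key=lambda r: r["score"],
    reverse=True,
)
```

In this sample, the rows with an empty or non-numeric score are dropped during cleaning, and the score for id 4 is dropped by the join because no student has that id.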