De Anza College

Credit - Degree applicable
Effective Quarter: Fall 2020

I. Catalog Information


CIS 64H
R Programming
4 1/2 Unit(s)
 

Advisory: EWRT 211 and READ 211 (or LART 211), or ESL 272 and 273; CIS 22A or CIS 36A or CIS 40.

Lec Hrs: 48.00
Lab Hrs: 18.00
Out of Class Hrs: 96.00
Total Student Learning Hrs: 162.00

This course is an introduction to the R programming language and its use in big data analytics. Topics covered include data objects, data cleansing, merging and sorting, statistical analysis of data, data graphics and visualization, and working with RStudio.


Student Learning Outcome Statements (SLO)

 

Design, implement, and debug R programs to process data from various sources for data analysis.

Use R graphics to display and visualize data.


II. Course Objectives

A.Describe R basics.
B.Exhibit understanding of R data objects.
C.Illustrate basic data transformation concepts.
D.Demonstrate extracting data from various sources.
E.Perform data manipulations to enable analysis.
F.Analyze data to derive patterns and hypotheses.
G.Design data visualizations to demonstrate analyses.

III. Essential Student Materials

 None

IV. Essential College Facilities

 Access to a computer lab with RStudio

V. Expanded Description: Content and Form

A.Describe R basics.
1.What is R?
2.Introduction to R and RStudio
3.Installing and using R packages
4.Working with R workspaces
B.Exhibit understanding of R data objects.
1.Vectors
2.Matrices
3.Data Frames
4.Lists
5.Local data import/export
C.Illustrate basic data transformation concepts.
1.Variables
2.Character and String Manipulation
3.Dates and Timestamps
4.Regular Expressions
5.Control Statements
6.Functions
D.Demonstrate extracting data from various sources.
1.Web data capture
2.API data sources
3.Connecting to external data sources
4.Data in single and distributed environments
E.Perform data manipulations to enable analysis.
1.Using 'dplyr'
2.Reshaping data
3.Cleansing data
4.Merging data
5.Splitting data
6.Conversion of data
F.Analyze data to derive patterns and hypotheses.
1.Data architecture patterns
2.Correlation clustering
3.Predictive analysis
4.Groupwise operations
5.Data redundancy
6.Descriptive statistics
7.Regression
8.Hypothesis testing
G.Design data visualizations to demonstrate analyses.
1.Core concepts of data graphics and visualization
2.R graphics engines
a.Base
b.Grid
c.Lattice
d.ggplot2
3.Customizing graphics with 'ggplot2'
a.Titles
b.Coordinate systems
c.Scales
d.Themes
e.Axis labels
f.Legends

VI. Assignments

A.Reading: Required reading from the textbook and class notes
B.Programs: 7-10 programming homework assignments.
C.Group Project: Data exploration and visualization of assigned datasets.

VII. Methods of Instruction

Lecture and visual aids
Discussion of assigned reading
Discussion and problem solving performed in class
Collaborative learning and small group exercises
Collaborative projects

VIII. Methods of Evaluating Objectives

A.One or two midterm examinations requiring some programming and clarification of concepts, in which students demonstrate mastery of the R programming constructs presented in the course.
B.A final examination requiring clarification of concepts, in which students demonstrate mastery of data exploration, analysis, and visualization principles.
C.Evaluation of programming assignments and the group project, based on correctness, documentation, code quality, and test plan execution.

IX. Texts and Supporting References

A.Examples of Primary Texts and References
1.Wickham, Hadley and Grolemund, Garrett: R for Data Science: Import, Tidy, Transform, Visualize, and Model Data 1st Edition. O'Reilly. ISBN-13: 978-1491910399, 2017.
2.Campbell, Matthew: Learn RStudio IDE: Quick, Effective, and Productive Data Science 1st Edition. Apress. ISBN-13: 978-1484245101, 2019.
B.Examples of Supporting Texts and References
1.Matloff, Norman: The Art of R Programming: A Tour of Statistical Software Design 1st Edition. No Starch Press. ISBN-13: 978-1593273842, 2011.
2.Teetor, Paul: R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics 1st Edition. O'Reilly. ISBN-13: 978-0596809157, 2011.

X. Lab Topics

A.Data types and data structures
B.Flow control and looping
C.Writing and calling functions
D.Split/apply/combine pattern
E.Working with character data and regular expressions
F.Regular expressions and web scraping
G.Reshaping data and database access
H.Simulation
I.Optimization
J.Data and predictive analysis