CE 477 | Course Introduction and Application Information

Course Name | Code | Semester | Theory (hour/week) | Application/Lab (hour/week) | Local Credits | ECTS
Data Science | CE 477 | Fall/Spring | 3 | 0 | 3 | 5

Prerequisites
None
Course Language
English
Course Type
Elective
Course Level
First Cycle
Course Coordinator
Course Lecturer(s)
Assistant(s) -
Course Objectives: The course introduces the principles and methods of data science, that is, learning from data for prediction and insight. It covers the key data science topics, including getting data, visualizing and exploring data, statistical analysis of data, and the use of machine learning in data science. The course focuses on developing hands-on data skills by having students complete a data science project.
Course Description: Students who succeed in this course
  • will be able to define computer tools to obtain, clean, and analyze data,
  • will be able to apply statistical methods and visualization to explore data,
  • will be able to use statistical and computational methods to make predictions from data,
  • will be able to perform data analysis with machine learning methods,
  • will be able to use statistics and data visualization tools to communicate the outcome of data analyses.
Course Content: The following topics will be included: getting and cleaning data; exploring data; statistical models of data; statistical inference; the main machine learning methods in data science, including linear regression, SVM, k-nearest neighbors, Naïve Bayes, logistic regression, decision trees, random forests, clustering, and dimensionality reduction; over-fitting; cross-validation; and feature engineering.

 



Course Category

Core Courses
Major Area Courses
Supportive Courses
Media and Management Skills Courses
Transferable Skill Courses

 

WEEKLY SUBJECTS AND RELATED PREPARATION STUDIES

Week | Subjects | Related Preparation
1 | Introduction: What is Data Science? Relationship of Data Science to Machine Learning | Chapter 1. Sections 1.1-1.3. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
2 | Getting data: reading files, scraping the web, using APIs. Working with data: exploring data, basic data cleaning and munging | Chapter 9. Sections 9.1-9.5. Chapter 10. Sections 10.1-10.4. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
3 | Exploratory Data Analysis: visualizing data, plots, summary statistics, mean and dispersion | Chapter 3. Sections 3.1-3.4. Chapter 5. Section 5.1. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
4 | Elements of probability: populations and samples, random variables, correlation, statistical dependence and independence, Bayes' theorem | Chapter 6. Sections 6.1-6.5. Chapter 5. Sections 5.2-5.5. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
5 | Statistical inference: hypotheses and tests, statistical models, linear models, maximum likelihood inference, p-values, confidence intervals | Chapter 7. Sections 7.1-7.6. Chapter 14. Sections 14.1, 14.3. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
6 | Using Machine Learning methods for prediction: regression, multivariate linear regression, and k-Nearest Neighbors | Chapter 14. Sections 14.1-14.2. Chapter 15. Sections 15.1-15.5. Chapter 12. Sections 12.1-12.2. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
7 | Midterm exam |
8 | Using Machine Learning for prediction: classification, logistic regression, linear discriminant classifier, largest margin classifier (SVM), and Naive Bayes | Chapter 16. Sections 16.1-16.5. Chapter 13. Sections 13.1-13.4. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
9 | Correctness when using Machine Learning: over-fitting, bias-variance tradeoff, cross-validation, feature selection | Chapter 11. Sections 11.4-11.6. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
10 | Feature Engineering: designing features, different types of features, relationship of features to models, relationship of data to features. Cleaning data: fixing data formats, fixing missing and damaged data, standardizing data (scaling and whitening) | Chapter 3. Sections 3.1-3.4. The Art of Data Science, R. D. Peng, E. Matsui; Chapter 4. Sections 4.1-4.6. Python Machine Learning, S. Raschka, ISBN 9781783555147
11 | Unsupervised data exploration: hierarchical clustering, k-means clustering | Chapter 19. Sections 19.1-19.6. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
12 | Unsupervised data exploration: association mining, dimensionality reduction | Chapter 10. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
13 | Decision Trees and Random Forests | Chapter 17. Sections 17.1-17.6. Data Science from Scratch: First Principles with Python, J. Grus, ISBN 9781491901427
14 | Project presentations |
15 | Project presentations |
16 | General semester review |

 

Course Notes/Textbooks

J. Grus, “Data Science from Scratch: First Principles with Python”, O’Reilly Media, 2015, ISBN 9781491901427; 9781491904381 (ebook)

Suggested Readings/Materials

T. Hastie, R. Tibshirani, J. Friedman, “The Elements of Statistical Learning”, Springer, 2013, ISBN 9780387216065
S. Raschka, “Python Machine Learning”, Packt Publishing, 2015, ISBN 9781783555147
R. D. Peng, E. Matsui, “The Art of Data Science”, https://leanpub.com/artofdatascience

 

EVALUATION SYSTEM

Semester Activities | Number | Weighting
Participation | |
Laboratory / Application | |
Field Work | |
Quizzes / Studio Critiques | |
Homework / Assignments | |
Presentation / Jury | |
Project | 1 | 25
Seminar / Workshop | |
Portfolios | |
Midterms / Oral Exams | 1 | 25
Final / Oral Exam | 1 | 50
Total | |

Weighting of Semester Activities on the Final Grade | 2 | 50
Weighting of End-of-Semester Activities on the Final Grade | 1 | 50
Total | |
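
The weights in the evaluation table can be read as a weighted average. The sketch below is only an illustration: it assumes every activity is scored on a 0-100 scale and that the scores are combined linearly, neither of which the syllabus states. The function name `final_grade` and the example scores are hypothetical.

```python
# Weights in percent, taken from the evaluation table.
WEIGHTS = {"project": 25, "midterm": 25, "final": 50}


def final_grade(scores):
    """Combine per-activity scores (assumed 0-100) into one weighted grade."""
    # The listed weights must account for the whole grade.
    assert sum(WEIGHTS.values()) == 100
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {"project": 80, "midterm": 70, "final": 90}
print(final_grade(example_scores))  # 82.5
```

With these weights, the semester activities (project and midterm) and the final exam each contribute half of the grade, matching the two 50 percent rows in the table.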

ECTS / WORKLOAD TABLE

Semester Activities | Number | Duration (Hours) | Workload
Course Hours (including exam week: 16 x total hours) | 16 | 3 | 48
Laboratory / Application Hours (including exam week: 16 x total hours) | 16 | |
Study Hours Out of Class | 14 | 2 | 28
Field Work | | |
Quizzes / Studio Critiques | | |
Homework / Assignments | | |
Presentation / Jury | | |
Project | 1 | 30 | 30
Seminar / Workshop | | |
Portfolios | | |
Midterms / Oral Exams | 1 | 20 | 20
Final / Oral Exam | 1 | 24 | 24
Total | | | 150
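
The workload total can be cross-checked by multiplying each activity's number by its duration and summing the results. This is a minimal sketch, assuming that the single figure listed for the project, the midterm, and the final is the duration in hours of one occurrence; the variable names are illustrative.

```python
# (number, duration in hours) for each activity with figures in the table.
# For Project, Midterm and Final the single listed figure is assumed to be
# the duration of one occurrence.
ACTIVITIES = {
    "Course Hours": (16, 3),
    "Study Hours Out of Class": (14, 2),
    "Project": (1, 30),
    "Midterms / Oral Exams": (1, 20),
    "Final / Oral Exam": (1, 24),
}

# Workload of an activity = number of occurrences x hours per occurrence.
workloads = {name: count * hours for name, (count, hours) in ACTIVITIES.items()}
total_workload = sum(workloads.values())

ECTS_CREDITS = 5  # from the course information at the top of the syllabus
hours_per_credit = total_workload / ECTS_CREDITS

print(workloads)
print(total_workload)    # 150
print(hours_per_credit)  # 30.0
```

The sum reproduces the stated total of 150 hours. The ratio of 30 hours per ECTS credit is derived here from the two figures in the syllabus rather than quoted from it.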

 

COURSE LEARNING OUTCOMES AND PROGRAM QUALIFICATIONS RELATIONSHIP

# | Program Competencies/Outcomes | * Contribution Level (1-5)
1 | Adequate knowledge in Mathematics, Science and Software Engineering; ability to use theoretical and applied information in these areas to model and solve Software Engineering problems
2 | Ability to identify, define, formulate, and solve complex Software Engineering problems; ability to select and apply proper analysis and modeling methods for this purpose
3 | Ability to design, implement, verify, validate, measure and maintain a complex software system, process or product under realistic constraints and conditions, in such a way as to meet the desired result; ability to apply modern methods for this purpose
4 | Ability to devise, select, and use modern techniques and tools needed for Software Engineering practice
5 | Ability to design and conduct experiments, gather data, analyze and interpret results for investigating Software Engineering problems
6 | Ability to work efficiently in Software Engineering disciplinary and multi-disciplinary teams; ability to work individually
7 | Ability to communicate effectively in Turkish, both orally and in writing; knowledge of a minimum of two foreign languages
8 | Recognition of the need for lifelong learning; ability to access information, to follow developments in science and technology, and to continue to educate him/herself
9 | Awareness of professional and ethical responsibility
10 | Information about business life practices such as project management, risk management, and change management; awareness of entrepreneurship, innovation, and sustainable development
11 | Knowledge about contemporary issues and the global and societal effects of engineering practices on health, environment, and safety; awareness of the legal consequences of Software Engineering solutions

*1 Lowest, 2 Low, 3 Average, 4 High, 5 Highest