

Keio University Shonan Fujisawa Campus
Course Summary (Syllabus)


INTERNET MEASUREMENT AND DATA ANALYSIS (Kenjiro Cho)

    Semester : 2014 Fall
    Code : C2083 / 2 Credits


1. Objectives/Teaching method

    In this class, you will learn methods for collecting and analyzing
    data on the Internet, and gain knowledge and understanding of
    networking technologies and large-scale data analysis.

    Each class covers a specific topic, through which you will learn the
    technologies and the theories behind them.
    In addition to the lecture, each class includes programming exercises
    for building data analysis skills.


2. Materials/Reading List

    The lecture slide materials will be provided online.

    ruby: http://www.ruby-lang.org/
    gnuplot: http://gnuplot.info/

    [1] Mark Crovella and Balachander Krishnamurthy.
    Internet Measurement: Infrastructure, Traffic, and Applications.
    Wiley, 2006.
    [2] Pang-Ning Tan, Michael Steinbach and Vipin Kumar.
    Introduction to Data Mining.
    Addison Wesley, 2006.
    [3] Raj Jain.
    The Art of Computer Systems Performance Analysis.
    Wiley, 1991.
    [4] Toby Segaran.
    Programming Collective Intelligence.
    O'Reilly Media, 2007.
    [5] Allen B. Downey.
    Think Stats: Probability and Statistics for Programmers.
    O'Reilly Media, 2011.
    [6] Chris Sanders.
    Practical Packet Analysis, 2nd Edition.
    No Starch Press, 2011.


3. Schedule

    #1 Introduction
    - Big Data and Collective Intelligence
    - Internet measurement
    - Large-scale data analysis
    - exercise: introduction to the Ruby scripting language

    #2 Data and variability
    - Summary statistics
    - Sampling
    - How to make good graphs
    - exercise: graph plotting with Gnuplot

    #3 Data recording and log analysis
    - Network management tools
    - Data format
    - Log analysis methods
    - exercise: log data and regular expressions

    #4 Distribution and confidence intervals
    - Normal distribution
    - Confidence intervals and statistical tests
    - Distribution generation
    - exercise: confidence intervals
    - assignment 1

    #5 Diversity and complexity
    - Long tail
    - Web access and content distribution
    - Power-law and complex systems
    - exercise: power-law analysis

    #6 Correlation
    - Online recommendation systems
    - Distance
    - Correlation coefficient
    - exercise: correlation analysis

    #7 Multivariate analysis
    - Data sensing and GeoLocation
    - Linear regression
    - Principal Component Analysis
    - exercise: linear regression

    #8 Time-series analysis
    - Internet and time
    - Network Time Protocol
    - Time series analysis
    - exercise: time-series analysis
    - assignment 2

    #9 Topology and graph
    - Routing protocols
    - Graph theory
    - exercise: shortest-path algorithm

    #10 Anomaly detection and machine learning
    - Anomaly detection
    - Machine Learning
    - Spam filtering and Bayes' theorem
    - exercise: naive Bayesian filter

    #11 Data Mining
    - Pattern extraction
    - Classification
    - Clustering
    - exercise: clustering

    #12 Search and Ranking
    - Search systems
    - PageRank
    - exercise: PageRank algorithm


4. Assignments/Examination/Grade Evaluation

    2 assignments and a final report.


5. Special Note

    The prerequisites for the class are basic programming skills and basic
    knowledge of statistics.

    In the exercises and assignments, you will need to write programs to
    process large data sets, using the Ruby scripting language and the
    Gnuplot plotting tool.
    To understand the theoretical aspects, you will need basic knowledge
    of algebra and statistics. However, the focus of the class is on
    understanding how mathematics is used in engineering applications.


6. Prerequisite / Related courses

    The prerequisites for the class are basic programming skills and basic knowledge of statistics.


7. Conditions to take this course

    -


8. Relation with past courses

    -


9. Course URL


2014-07-07 11:40:28.79667

