Data analytics and visualization is an emerging discipline of immense importance to any data-driven organization. This project-focused course gives students knowledge of tools for data mining and visualization, along with practical experience applying data mining and machine learning algorithms to very large data sets. It also covers methods and models for communicating analytical results effectively through data visualization.
Lectures: Wed, 13:00 - 16:00 at BRG 213 (Bergeron building)
Office Hours: Thu, 16:00 - 17:00 at LAS3050 (or right after class)
The course will rely mainly on the following textbooks.
Download the syllabus (v1.0)
Introduction, administrivia.
Readings:
Introduction to networks, introduction to main problems about network analysis, basic mathematical concepts, bow-tie structure of the Web.
Readings:
Optional readings:
Degree distributions, shortest paths, clustering coefficient, measuring power-laws.
Readings:
Optional readings:
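As a small illustration of two of the measures listed for this lecture, the sketch below computes a degree distribution and a local clustering coefficient on a toy undirected graph. The graph `toy_graph` and the helper names `degree_distribution` and `local_clustering` are assumptions made for this example and are not taken from the course material.

```python
from collections import Counter
from itertools import combinations


def degree_distribution(adj):
    """Return {degree: fraction of nodes having that degree}."""
    counts = Counter(len(neighbours) for neighbours in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are linked to each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # the coefficient is conventionally 0 for degree < 2
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


# Hypothetical toy graph: a triangle (a, b, c) with a pendant node d attached to c.
# The adjacency sets are symmetric because the graph is undirected.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

print(degree_distribution(toy_graph))
print({v: local_clustering(toy_graph, v) for v in toy_graph})
```

Node `c` has three neighbours, of which only the pair (a, b) is connected, so its coefficient is 1/3. Nodes `a` and `b` sit inside the triangle and score 1.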
Erdős-Rényi random graph model, small-world model, configuration model, power-law distributions, scale-free networks, the anatomy of the long tail, preferential attachment model.
Readings:
Optional readings:
Web search, Hubs and Authorities (HITS), PageRank, topic-sensitive PageRank, personalised PageRank, link prediction, neighborhood-based prediction methods, node-proximity-based prediction methods, supervised learning models, Facebook's "PYMK" (People You May Know) algorithm, Twitter's "WtF" (Who to Follow) algorithm.
Readings:
Optional readings:
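To make the PageRank topic of this lecture concrete, here is a minimal power-iteration sketch. It assumes a damping factor of 0.85 and spreads the rank of dangling nodes uniformly over all nodes, which is one common convention among several. The graph `toy_web` and the function name `pagerank` are hypothetical and chosen only for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration for PageRank.

    `out_links` maps every node to the list of nodes it links to;
    every link target must also appear as a key.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = damping * rank[v] / len(targets)
                for w in targets:
                    new_rank[w] += share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:  # stop once the scores have stabilised
            break
    return rank


# Hypothetical four-page web: a -> b, a -> c, b -> c, c -> a, d -> c.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

Page `d` has no incoming links, so it keeps only the teleportation share (1 - 0.85) / 4. Page `c` collects links from three pages and ranks highest.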
Network communities, strength of weak ties, community detection, Girvan-Newman algorithm, modularity, modularity optimization, graph partitioning, graph cuts, conductance, spectral graph theory, spectral graph clustering.
Readings:
Optional readings:
Spreading through networks, Granovetter’s model of collective action, decision-based model of diffusion, game-theoretic model of cascades, probabilistic models of diffusion, epidemic model based on trees, models of disease spreading (SIR, SIS, SIRS), independent cascade model, modeling interactions between contagions.
Readings:
Optional readings:
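The SIR model named among this lecture's topics can be sketched in a few lines. The version below is the classic well-mixed (non-network) formulation, stepped forward with a simple Euler scheme. The parameter values, the step size `dt`, and the function name `simulate_sir` are assumptions made for illustration, not values taken from the course.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps, dt=0.1):
    """Euler integration of the well-mixed SIR model.

    beta  : transmission rate per unit time
    gamma : recovery rate per unit time
    Returns a list of (susceptible, infected, recovered) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r  # total population stays constant
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical outbreak: 1000 people, 10 initially infected, beta / gamma = 3.
history = simulate_sir(s0=990, i0=10, r0=0, beta=0.3, gamma=0.1, steps=2000)
peak_infected = max(i for _, i, _ in history)
print(round(peak_infected, 1), [round(x, 1) for x in history[-1]])
```

Dropping the recovered compartment and returning recovered individuals to the susceptible pool would give the SIS variant. On a network, the `s * i / n` mixing term would be replaced by contacts along edges.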
Anscombe's quartet, Bertin's visual variables, cognition and perception, colors, pre-attentive vs attentive processing, Gestalt principles, visual metaphors, Tufte's principles of graphical excellence, data sculpture.
Readings:
Taxonomy of visualization, visualizing qualitative and quantitative data (comparisons, proportions, relationships, hierarchies, maps, part-to-whole, distributions, patterns).
Readings:
Review class material, network analytics, data visualization.
Readings:
Project-focused course; no assignments.
Project handout
Past Project
Online resources of data
Online resources of network data
Online data visualization resources
Data cleansing/wrangling
Graph/network analysis
Graph/network exploration and visualization
Data Visualization
A list of useful online tutorials relating to the course material
Similar courses about information networks and network analysis