Lectures

  1. Lecture 1: Introduction to Urban Data Science
  2. Lecture 2: Spatial and Urban Data
  3. Lecture 3: Data Grammar
  4. Lecture 4: Data Engineering
  5. Lecture 5: EDA and Visualisation
  6. Lecture 6: Geo-Visualisation
  7. Lecture 7: Networks and Spatial Weights
  8. Lecture 8: Exploratory Spatial Data Analysis
  9. Lecture 9: Machine Learning for Everyone
  10. Lecture 10: Anatomy of a Learning Algorithm
  11. Lecture 11: Clustering
  12. Lecture 12: Dimensionality Reduction
  13. Lecture 13: Spatial Density Estimation
  14. Lecture 14: Responsible Data Science

A GUIDE TO FOLLOW THIS PAGE:

  • The slides will be updated, in PDF format, the night before the lecture at the latest.
  • Lectures will not be recorded or delivered online.
  • The section To do before class provides content that is useful for following the lectures. I expect you to follow it before every lecture. It will take about 1 hour of prep.
  • The section Extra Material is exactly that. It is not required for this course but can prove really helpful for gaining extra knowledge, either during or after the course. Sometimes I use it to build the contents of the lecture, and at other times I find it helpful in my research on the weekly topics, but I will never test you on it.



Let’s begin

Before starting this course, watch this video by Khalid Kadir, a reflection on poverty (an example of a social problem), expertise and equity. It is an example of how experts create boxes around their craft. As data scientists or future experts (consultants, data analysts, policymakers, etc.), it is our responsibility to step out of those boxes and engage with communities to strive for just outcomes.




Lecture 1 - Introduction to Urban Data Science

To do before class [Takes about 1 hour of prep at home]

As a way to whet your appetite about the content of the first class, I recommend you:

Extra Material [Always to learn more but never needed for the course]

The contents of this lecture are loosely based on, and explored in further detail in, the following four references:

  • “Chapter 1: Introduction” (Schutt & O’Neil, 2013). Free sampler of the book containing the chapter available online (html, pdf).
  • Excellent overview of Data Science (Donoho, 2017).
  • A Geographic take on Data Science, proposing a new field (Singleton & Arribas-Bel, 2019).
  • A critical approach to Data Science for Cities

References

  1. Schutt, R., & O’Neil, C. (2013). Doing data science: Straight talk from the frontline. O’Reilly Media, Inc.
  2. Donoho, D. (2017). 50 years of data science. Journal of Computational and Graphical Statistics, 26(4), 745–766.
  3. Singleton, A., & Arribas-Bel, D. (2019). Geographic Data Science. Geographical Analysis.



Lecture 2 - Spatial and Urban Data

Slides

To do before class [Takes about 1 hour of prep at home]

  • Watch the TED talk by Carlo Ratti about MIT’s SENSEable City Lab projects: an excellent set of examples.
  • Read the New York Times piece on US buildings map
  • Explore the GHSL Dataset, by the European Commission

Extra Material [Always to learn more but never needed for the course]

  • The part of the lecture on new sources of data relies on (Arribas-Bel, 2014) and (Lazer & Radford, 2017).
  • (Goodchild, 2007): a classic on the rise of volunteered geographic information.
  • (Kitchin, 2014): recent book on the data revolution from a Social Science/Human geography perspective.

References

  1. Arribas-Bel, D. (2014). Accidental, open and everywhere: Emerging data sources for the understanding of cities. Applied Geography, 49, 45–53.
  2. Lazer, D., & Radford, J. (2017). Data ex Machina: Introduction to Big Data. Annual Review of Sociology.
  3. Goodchild, M. F. (2007). Citizens as sensors: the world of volunteered geography. GeoJournal, 69(4), 211–221.
  4. Kitchin, R. (2014). The data revolution: Big data, open data, data infrastructures and their consequences. Sage.



Lecture 3 - Data Grammar

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]

  • A cheatsheet (such a misnomer: nobody is cheating, and it is a helpful and beautiful resource) on Data Wrangling with Pandas that you may want to stick on your wall or set as your screensaver, to save time when looking for useful, working code snippets.



Lecture 4 - Data Engineering

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]

The contents of this lecture are loosely based on, and explored in further detail in, the following two references:




Lecture 5 - EDA and Visualisation

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]

  • Berinato, S. Visualisations That Really Work, Harvard Business Review, Jun 2016
  • Wainer, H. How to Display Data Badly. The American Statistician 1984; 38: 137-147
  • Alberto Cairo’s weblog, The Functional Art, about information design and visualisation, is an excellent resource for improving your visualisations.
  • The book “Visualize This” (Yau, 2011) is a good general introduction to visualisation.
  • Check out the From Data to Viz chart selector for choosing the right chart.

References

  1. Tufte, E. R. (1983). The visual display of quantitative information. Graphics Press, Cheshire, CT.
  2. Yau, N. (2011). Visualize this: the FlowingData guide to design, visualization, and statistics. John Wiley & Sons.



Lecture 6 - Geo-Visualisation

Slides

To do before class [Takes about 1 hour of prep at home]

  • Watch this lecture on “Statistical maps” by Luc Anselin (link to 25min video).
  • Read the Conversation piece on the Flint case, where the MAUP played a key role.
  • Spend the rest of the prep hour browsing through Nathan Yau’s excellent blog, FlowingData.

Extra Material [Always to learn more but never needed for the course]

References

  1. Rey, S. (2015). Geovisualization. In GPH471: Geographic Information Analysis. Lecture slides from a course taught at Arizona State University.
  2. Brewer, C. (2015). Designing better Maps: A Guide for GIS users. ESRI Press.



Lecture 7 - Networks and Spatial Weights

Slides

To do before class [Takes about 1 hour of prep at home]

  • Read Eli Knaap’s blog on Measuring Urban Segregation with Spatial Computation
  • Watch this lecture on “Spatial Weights” by Luc Anselin (link to 34min video). Keep in mind the motivation, in this case, is focused on spatial regression.
  • Watch this lecture on “Spatial lag” by Luc Anselin (link to video; you can ignore the last five minutes, as they are a bit more advanced).

Extra Material [Always to learn more but never needed for the course]

  • Check out Geoff Boeing’s computational notebook showcasing OSMnx, a Python library for processing street networks as network objects, with a case study of urban street network analysis.
  • For a more advanced and detailed treatment, (Anselin & Rey, 2014) is an excellent reference.

References

  1. Anselin, L., & Rey, S. J. (2014). Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySAL. Chicago, IL: GeoDa Press LLC.



Lecture 8 - Exploratory Spatial Data Analysis

Slides

To do before class [Takes about 1 hour of prep at home]

  • Watch this lecture on “Spatial Autocorrelation (Background)” by Luc Anselin. [Part I][Part II]

Extra Material [Always to learn more but never needed for the course]

  • (Anselin, 1996) reviews the use of the Moran plot as an ESDA tool (you may access it on Sci-Hub using the DOI https://doi.org/10.1111/j.1467-9787.1996.tb01101.x).
  • (Symanzik, 2014) introduces the main concepts behind ESDA.
  • (Haining, 2014) offers an excellent historical perspective on the origins and motivations behind most of the global and local measures of spatial autocorrelation.

References

  1. Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. Spatial Analytical Perspectives on GIS, 111, 111–125.
  2. Symanzik, J. (2014). Exploratory Spatial Data Analysis. In Handbook of Regional Science (pp. 1295–1310). Springer.
  3. Haining, R. (2014). Spatial Data and Statistical Methods: A Chronological Overview. In Handbook of Regional Science (pp. 1277–1294). Springer.