Lectures

  1. Lecture 1: Introduction to Urban Data Science
  2. Lecture 2: Spatial and Urban Data
  3. Lecture 3: Data Grammar
  4. Lecture 4: Data Engineering
  5. Lecture 5: EDA and Visualisation
  6. Lecture 6: Geo-Visualisation
  7. Lecture 7: Networks and Spatial Weights
  8. Lecture 8: Exploratory Spatial Data Analysis
  9. Lecture 9: Machine Learning for Everyone
  10. Lecture 10: Anatomy of a Learning Algorithm
  11. Lecture 11: Clustering
  12. Lecture 12: Dimensionality Reduction
  13. Lecture 13: Spatial Density Estimation
  14. Lecture 14: Responsible Data Science

A GUIDE TO FOLLOW THIS PAGE:

  • The slides will be updated, in PDF format, no later than the night before each lecture.
  • Lectures will not be recorded or delivered online.
  • The section To do before class provides content that is useful for following the lectures. I expect you to go through it before every lecture. It will take about 1 hour of prep.
  • The section Extra Material is exactly that. It is not required for this course, but it can prove really helpful for gaining extra knowledge during or after the course. Sometimes I use it to build the contents of the lecture; at other times I find it helpful in my own research on the weekly topic. I will never examine you on it.



Let’s begin

Before starting this course, watch this video by Khalid Kadir, a reflection on poverty (as an example of a social problem), expertise and equity. It illustrates how experts draw boxes around their craft. As data scientists and future experts (consultants, data analysts, policymakers, etc.), it is our responsibility to step out of those boxes and engage with communities to strive for just outcomes.




Lecture 1 - Introduction to Urban Data Science

To do before class [Takes about 1 hour of prep at home]

As a way to whet your appetite about the content of the first class, I recommend you:

Extra Material [Always to learn more but never needed for the course]

The contents of this lecture are loosely based on, and explored in further detail in, the following four references:

  • “Chapter 1: Introduction” (Schutt & O’Neil, 2013). A free sampler of the book containing this chapter is available online (html, pdf).
  • Excellent overview of Data Science (Donoho, 2017).
  • A Geographic take on Data Science, proposing a new field (Singleton & Arribas-Bel, 2019).
  • A critical approach to Data Science for Cities

References

  1. Schutt, R., & O’Neil, C. (2013). Doing data science: Straight talk from the frontline. O’Reilly Media, Inc.
  2. Donoho, D. (2017). 50 years of data science. Journal of Computational and Graphical Statistics, 26(4), 745–766.
  3. Singleton, A., & Arribas-Bel, D. (2019). Geographic Data Science. Geographical Analysis.



Lecture 2 - Spatial and Urban Data

Slides

To do before class [Takes about 1 hour of prep at home]

  • Watch the TED talk by Carlo Ratti about MIT’s SENSEable City Lab projects: an excellent set of examples.
  • Read the New York Times piece on US buildings map
  • Explore the GHSL Dataset, by the European Commission

Extra Material [Always to learn more but never needed for the course]

  • The part of the lecture on new sources of data relies on (Arribas-Bel, 2014) and (Lazer & Radford, 2017).
  • (Goodchild, 2007): a classic on the rise of volunteered geographic information.
  • (Kitchin, 2014): recent book on the data revolution from a Social Science/Human geography perspective.

References

  1. Arribas-Bel, D. (2014). Accidental, open and everywhere: Emerging data sources for the understanding of cities. Applied Geography, 49, 45–53.
  2. Lazer, D., & Radford, J. (2017). Data ex Machina: Introduction to Big Data. Annual Review of Sociology, (0).
  3. Goodchild, M. F. (2007). Citizens as sensors: the world of volunteered geography. GeoJournal, 69(4), 211–221.
  4. Kitchin, R. (2014). The data revolution: Big data, open data, data infrastructures and their consequences. Sage.



Lecture 3 - Data Grammar

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]

  • A cheatsheet (such a misnomer: nobody is cheating, and it is a helpful and beautiful resource) on Data Wrangling with Pandas that you may want to stick on your wall or set as your screensaver, to save time looking up useful, ready-to-use code.



Lecture 4 - Data Engineering

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]

The contents of this lecture are loosely based on, and explored in further detail in, the following two references:




Lecture 5 - EDA and Visualisation

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]

  • Berinato, S. Visualisations That Really Work, Harvard Business Review, Jun 2016
  • Wainer, H. How to Display Data Badly. The American Statistician 1984; 38: 137-147
  • Alberto Cairo’s weblog The Functional Art, about information design and visualisation, is an excellent resource for improving your visualisations.
  • (Yau, 2011)’s book “Visualize this” is a good general introduction to visualisation.
  • Check out the From Data to Viz chart selector to choose the right chart.

References

  1. Tufte, E. R. (1983). The visual display of quantitative information. Graphics press Cheshire, CT.
  2. Yau, N. (2011). Visualize this: the FlowingData guide to design, visualization, and statistics. John Wiley & Sons.



Lecture 6 - Geo-Visualisation

Slides

To do before class [Takes about 1 hour of prep at home]

  • Watch this lecture on “Statistical maps” by Luc Anselin (link to 25min video).
  • Read the Conversation piece on the Flint case, where the MAUP played a key role.
  • Spend the rest of the prep hour browsing through Nathan Yau’s excellent blog, FlowingData.
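
If the MAUP (modifiable areal unit problem) is new to you, the following minimal sketch may help before you read the Flint piece. It uses made-up numbers for four hypothetical blocks along a street (these are not the Flint data) and shows that the same block-level counts produce very different rates depending on how the blocks are grouped into zones. The function name `zone_rates` and the two zonings are my own illustrative assumptions.

```python
# Hypothetical blocks along a street: (cases, population). Not real data.
blocks = [(8, 10), (0, 10), (0, 10), (8, 10)]

def zone_rates(blocks, zones):
    """Aggregate block counts into zones and return each zone's rate."""
    rates = []
    for zone in zones:
        cases = sum(blocks[i][0] for i in zone)
        population = sum(blocks[i][1] for i in zone)
        rates.append(cases / population)
    return rates

# Zoning A: two halves of the street. Both zones look identical.
two_halves = zone_rates(blocks, [[0, 1], [2, 3]])

# Zoning B: the two ends kept separate, the middle merged. Hotspots appear.
ends_and_middle = zone_rates(blocks, [[0], [1, 2], [3]])

print(two_halves)
print(ends_and_middle)
```

Under zoning A the two high-rate blocks are diluted by their neighbours; under zoning B they stand out. Nothing about the underlying data changed, only the boundaries.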

Extra Material [Always to learn more but never needed for the course]

References

  1. Rey, S. (2015). Geovisualization. In GPH471: Geographic Information Analysis. Lecture slides from a course taught at Arizona State University.
  2. Brewer, C. (2015). Designing better Maps: A Guide for GIS users. ESRI Press.



Lecture 7 - Networks and Spatial Weights

Slides

To do before class [Takes about 1 hour of prep at home]

  • Read Eli Knaap’s blog on Measuring Urban Segregation with Spatial Computation
  • Watch this lecture on “Spatial Weights” by Luc Anselin (link to 34min video). Keep in mind the motivation, in this case, is focused on spatial regression.
  • Watch this lecture on “Spatial lag” by Luc Anselin (link to video; you can ignore the last five minutes, as they are a bit more advanced).
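
As a way to anchor the two ideas from these videos, here is a minimal sketch in plain Python (deliberately not the PySAL API). It assumes four hypothetical areas arranged in a row, with made-up attribute values. A row-standardised contiguity matrix gives each area's neighbours equal weights that sum to one; the spatial lag of an area is then simply the average value of its neighbours.

```python
# Hypothetical example: four areas in a row (0-1-2-3); neighbours share a border.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = [10.0, 20.0, 30.0, 40.0]  # made-up attribute, e.g. average income

def row_standardised_weights(neighbours):
    """Build a dense weights matrix in which every row sums to 1."""
    n = len(neighbours)
    w = [[0.0] * n for _ in range(n)]
    for i, js in neighbours.items():
        for j in js:
            w[i][j] = 1.0 / len(js)
    return w

def spatial_lag(w, y):
    """Weighted average of neighbouring values for every area."""
    return [sum(w_ij * y_j for w_ij, y_j in zip(row, y)) for row in w]

W = row_standardised_weights(neighbours)
lag = spatial_lag(W, values)
print(lag)
```

In the lab sessions you would normally let a library build the weights from geometries; the point here is only to show what the resulting matrix and lag mean.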

Extra Material [Always to learn more but never needed for the course]

  • Check out Geoff Boeing’s computational notebook showcasing the use of OSMnx, a Python library for processing street networks as network objects, with a case study of Urban Street Network Analysis.
  • For an advanced and detailed treatment, (Anselin & Rey, 2014) is an excellent reference.

References

  1. Anselin, L., & Rey, S. J. (2014). Modern Spatial Econometrics in Practice: A Guide to GeoDa, GeoDaSpace and PySAL. Chicago, IL: GeoDa Press LLC.



Lecture 8 - Exploratory Spatial Data Analysis

Slides

To do before class [Takes about 1 hour of prep at home]

  • Watch this lecture on “Spatial Autocorrelation (Background)” by Luc Anselin. [Part I][Part II]

Extra Material [Always to learn more but never needed for the course]

  • (Anselin, 1996) reviews the use of the Moran plot as an ESDA tool (You may access it on Scihub using the doi https://doi.org/10.1111/j.1467-9787.1996.tb01101.x).
  • (Symanzik, 2014) introduces the main concepts behind ESDA.
  • (Haining, 2014) is an excellent historical perspective on the origins and motivations behind most of the global and local measures of spatial autocorrelation.
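
For readers who prefer to see the arithmetic, below is a minimal sketch of global Moran's I in plain Python, using the standard formula I = (n / S0) · Σᵢⱼ wᵢⱼ zᵢ zⱼ / Σᵢ zᵢ², where z are deviations from the mean and S0 is the sum of all weights. The weights matrix and both value vectors are made up (four hypothetical areas in a row), and the function name `morans_i` is my own; this is not the PySAL implementation and it does no inference.

```python
def morans_i(w, y):
    """Global Moran's I for values y and a dense weights matrix w."""
    n = len(y)
    mean = sum(y) / n
    z = [v - mean for v in y]            # deviations from the mean
    s0 = sum(sum(row) for row in w)      # sum of all weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(v * v for v in z)
    return (n / s0) * num / den

# Hypothetical row-standardised weights for four areas in a row (0-1-2-3).
W = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]

smooth = [10, 20, 30, 40]        # similar values sit next to each other
alternating = [10, 40, 10, 40]   # every neighbour is dissimilar

i_smooth = morans_i(W, smooth)
i_alternating = morans_i(W, alternating)
print(i_smooth, i_alternating)
```

The smooth gradient gives a positive value and the alternating pattern a negative one. With row-standardised weights, I is also the slope of the line in the Moran plot discussed in (Anselin, 1996).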

References

  1. Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. Spatial Analytical Perspectives on GIS, 111, 111–125.
  2. Symanzik, J. (2014). Exploratory Spatial Data Analysis. In Handbook of Regional Science (pp. 1295–1310). Springer.
  3. Haining, R. (2014). Spatial Data and Statistical Methods: A Chronological Overview. In Handbook of Regional Science (pp. 1277–1294). Springer.



Lecture 9 - Machine Learning for Everyone

You may want to buy The Hundred-Page Machine Learning Book, as some of its chapters will be used for topics from this point onwards and it is generally a fantastic book to have. If you cannot or do not want to spend $20.00 on the e-copy, email me and we will figure something out. The author has invested a lot in writing this book, and it is an excellent resource on Machine Learning, even beyond this class.

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]

The contents of this lecture are loosely based on, and explored in further detail in, the following two references:




Lecture 10 - Anatomy of a Learning Algorithm

Slides

To do before class [Takes about 1 hour of prep at home]

Extra Material [Always to learn more but never needed for the course]




Lecture 11 - Clustering

Slides

To do before class [Takes about 1 hour of prep at home]

  • Talk on “Geodemographics and the Internal Structure of Cities” by Prof. Alex Singleton (link to 50min. video).

Extra Material [Always to learn more but never needed for the course]

  • Chapters 1 and 2 in (Webber & Burrows, 2018) provide a fascinating account of the origins of Geodemographic classifications.
  • Chapter 7 in (Brunsdon & Singleton, 2015): Geodemographic Analysis, by Alexandros Alexiou and Alex Singleton.
  • (Duque, Ramos, & Suriñach, 2007) is an excellent, very readable review of regionalisation algorithms.
  • (Oke et al., 2019) provides a comprehensive urban typology framework using hierarchical clustering methods and a large, diverse set of data.

References

  1. Webber, R., & Burrows, R. (2018). The Predictive Postcode: The Geodemographic Classification of British Society. SAGE.
  2. Brunsdon, C., & Singleton, A. (2015). Geocomputation: A Practical Primer. SAGE.
  3. Duque, J. C., Ramos, R., & Suriñach, J. (2007). Supervised regionalisation methods: A survey. International Regional Science Review, 30(3), 195–220.
  4. Oke, J. B., Aboutaleb, Y. M., Akkinepally, A., Azevedo, C. L., Han, Y., Zegras, P. C., … & Ben-Akiva, M. E. (2019). A novel global urban typology framework for sustainable mobility futures. Environmental Research Letters, 14(9), 095006.



Lecture 12 - Dimensionality Reduction

Slides

To do before class [Takes about 1 hour of prep at home]

  • Read through this excellent step-wise example of Principal Component Analysis using airport delay data
  • Read this excellent community-driven explanation of PCA on StackExchange.

Extra Material [Always to learn more but never needed for the course]

The contents of this lecture are loosely based on, and explored in further detail in, the following reference:




Lecture 13 - Spatial Density Estimation

Slides

To do before class [Takes about 1 hour of prep at home]

  • Lecture on “Point Pattern Analysis Basics” by Luc Anselin (link to 45min video, and link to a more recent 6 min intro).

Extra Material [Always to learn more but never needed for the course]

  • This class was partially based on (Rey, 2015).
  • The slides for this lecture were also inspired by Part 6 in (Brunsdon & Comber, 2015).

References

  1. Rey, S. (2015). Point Pattern Basics. In GPH471: Geographic Information Analysis. Lecture slides from a course taught at Arizona State University.
  2. Brunsdon, C., & Comber, L. (2015). An Introduction to R for Spatial Analysis and Mapping. SAGE Publications Ltd.



Lecture 14 - Responsible Data Science

Slides

To do before class [Takes about 1 hour of prep at home]

  • Read A City Is Not a Computer by Shannon Mattern, which carefully examines the limitations of computation in bettering the human condition.
  • Explore the Gender Shades project by Joy Buolamwini and Timnit Gebru, which uncovers the priorities, preferences and prejudices of influential organisations that develop automated systems.

Extra Material [Always to learn more but never needed for the course]

References

  1. El-Geneidy, A., Levinson, D., Diab, E., Boisjoly, G., Verbich, D., & Loong, C. (2016). The cost of equity: Assessing transit accessibility and social disparity using total travel cost. Transportation Research Part A: Policy and Practice, 91, 302-316.