Project Template

An outline for writing a data science article

Before you start writing the final article for this course, I suggest you read “A guide to writing scientific text”. The final article should be a list of bullet points rather than paragraphs [1], following the outline below.


Please include:

  • a suitable title
  • your names
  • student numbers

Abstract (250 words or less)

  • State what we know about the subject/problem/big picture
  • Explain what we don’t know: the larger theoretical or practical puzzle or gap in the literature
  • State or describe your research question and explain how you answer that question
  • Briefly describe the data that you use to answer your research question
  • State what you find
  • If necessary, describe what these findings suggest about the answer to your research question
  • Explain why these findings are important


Introduction

  • Describe the puzzle or gap in the literature that you will address with your data
    • What do we know?
    • What do we not know?
    • What will you tell us?
  • Identify your research question and explain how you answer it
    • What question will you answer?
    • What data will you use to answer this question?
    • What do you find?
  • Explain the importance of your findings
    • What is the answer to your research question?
    • How does this answer broaden, clarify, or challenge existing knowledge/theories?

Literature Review

  • Restate the puzzle or gap in the literature that you will address
  • Explain why this puzzle or gap is important to address
  • Describe (in more detail than in the intro) what we know about this topic/issue
  • Describe (in more detail than in the intro) what we do not know about this topic/issue
  • State your research question (e.g., “In this article, we investigate…”)
  • Explain how your research question solves the puzzle or fills the gap in the literature (e.g., “Answering this question allows me to…”)

*Note: The point of a literature review is not to review all of the relevant literature paragraph by paragraph. The point is to make the case for why your study is important; the literature summaries and references are there to justify your work.

Exploratory Data Analysis

  • Provide a brief overview of the study: an outline that tells the reader how all of the following methods and analysis will be connected.
  • Start by describing all the data you will be using, answering some of the following questions:
    • Why did you choose it?
    • How did you gain access to it?
    • What is the research site: which city, infrastructure, and/or populations are you studying?
  • If the data is collected over time or through experiments/scraping, describe the setup you created or utilised and the data you collected.
  • Describe your process for interrogating and analysing the data (i.e., how you cleaned and prepared the data, analysed it, and modelled it).
  • Describe the limitations of your data and study (i.e., explain how your study is limited by your data and methodological choices)
    • Discuss the sources of errors or missing values in your data and how you addressed those limitations (it is okay to acknowledge that something is broken and describe the ways in which some of it has been addressed in your study).
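
A discussion of missing values is easier to write with a quick count of how much of each column is missing. The following is a minimal sketch in plain Python, not a prescribed method: the function name `missing_value_report`, the example records, and the choice of which values count as missing (`None`, an empty string, or `"NA"`) are all assumptions made for illustration.

```python
from collections import Counter


def missing_value_report(rows, missing=(None, "", "NA")):
    """Return {column: fraction of rows in which that column is missing}.

    A column counts as missing in a row if the key is absent or if its
    value is one of the markers in `missing` (an assumption; adapt it to
    however your own data encodes missing values).
    """
    columns = sorted({key for row in rows for key in row})
    counts = Counter()
    for row in rows:
        for col in columns:
            if row.get(col) in missing:
                counts[col] += 1
    total = len(rows)
    return {col: (counts[col] / total if total else 0.0) for col in columns}


# Hypothetical records, made up for illustration only.
rows = [
    {"station": "A", "temp_c": 12.5, "humidity": 0.61},
    {"station": "B", "temp_c": None, "humidity": 0.58},
    {"station": "C", "temp_c": 14.1, "humidity": ""},
    {"station": "D", "temp_c": None},  # "humidity" key absent entirely
]

report = missing_value_report(rows)
for column, fraction in report.items():
    print(f"{column}: {fraction:.0%} missing")
```

Reporting these fractions next to a short explanation of why the values are missing, and what you did about them, is usually enough for the limitations discussion.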

Analysis (Only what is necessary and sufficient to address your research question)

  • State your argument and experimental design, i.e., the methods used to model the data. For example, did you cluster your data spatially, or use a supervised learning algorithm to infer which variables give the most information about a variable of interest?
    • Carefully explain your choice of model.
  • Identify 2-3 supporting points – how your data and output of modelling support your argument
  • Identify 2-3 patterns in the data that provide evidence for each supporting point
  • For each pattern (use figures wherever needed as evidence and to give insight into your data):
    • Describe an example from your data that typifies this pattern
    • Provide a brief excerpt for any outliers in your data
    • Briefly explain how this example represents the larger pattern
    • Briefly explain how this pattern provides evidence for the supporting point
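
When reporting outliers alongside a pattern, it helps to state the rule used to flag them. The sketch below uses the common interquartile-range (IQR) rule as one possible choice; it is an illustration under stated assumptions, not the required approach. The function name `flag_outliers_iqr`, the multiplier `k=1.5`, and the sample readings are hypothetical.

```python
import statistics


def flag_outliers_iqr(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements, made up for illustration only.
readings = [10, 12, 11, 13, 12, 11, 95, 12]
outliers = flag_outliers_iqr(readings)
print("Flagged as outliers:", outliers)
```

Whatever rule you choose, say in the text which rule it was and whether the flagged points were kept, corrected, or excluded, so that readers can judge how they affect the pattern.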

*Note: Everything that you include in your analysis should directly support your argument, and that argument should be the answer to your research question. A clear structure (with topic sentences and transitions) is very important for writing an analysis that meets this goal. If you did 500 things that turned out to be irrelevant to your research question because you did not learn anything from them, do not include those observations.


Discussion and Conclusion

  • Summarize your findings
    • Remind readers of the puzzle/gap in the literature that you are trying to solve
    • Remind readers of the specific research question that you have addressed
    • Briefly review what you found
    • Briefly explain what these findings imply about the answer to your research question
  • Discuss the implications of your findings
    • Explain how your findings solve the puzzle or fill the gap in the literature
    • Explain how the resolution of this gap/puzzle helps to clarify, challenge, or expand existing knowledge or theory
    • Using existing literature, explain why your findings are or are not surprising
  • Identify possible explanations for your findings
    • Use existing research to discuss the most likely explanation for your findings
    • Consider alternative explanations for your findings and explain (using your data and/or other research) why these alternative explanations do or do not seem plausible
  • Conclude by reviewing why these findings (and the larger puzzle/gap they address) are important


References

  • A list of references in APA format

  [1] Note: I am not interested in assessing your writing skills in English. The assessment of this course will evaluate your ability to obtain, scrub, explore, model, interpret and communicate your findings coherently.