About the NYC Open Data Lab
The NYC Open Data Lab is a public-facing initiative dedicated to transforming New York City’s open data into reproducible research, civic insights, and accessible public scholarship.
The Lab brings together teaching, tool development, and real-world analysis into a single, connected ecosystem. Using data from the NYC Open Data Portal, we focus not just on making data available but on making it usable, understandable, and meaningful.
At its core, the Lab is built on a simple idea: data work should be transparent, reproducible, and publicly visible.
A Reproducible Approach to Civic Data
All work within the Lab is built using reproducible research workflows in R and Quarto. Each project integrates code, data, and narrative into a single, cohesive document—ensuring that analyses are not only readable, but also verifiable and reusable.
This approach allows others to:
- understand how results were produced
- replicate findings
- extend existing work
Reproducibility is not treated as an add-on—it is the foundation of the Lab’s design.
From Coursework to Public Scholarship
A central component of the Lab is the transformation of student work into public-facing research.
Graduate students develop original projects using NYC Open Data. These projects are then refined, published, and shared through the Lab, and they are designed to be:
- Public-facing
- Reproducible
- Portfolio-ready
Rather than disappearing at the end of a semester, student work becomes part of a growing body of civic data research—contributing to ongoing conversations about New York City.
Tools for Open Data Access
The Lab also develops open-source tools to support reproducible data workflows.
One example is the nycOpenData R package, which provides a streamlined interface for accessing datasets from the NYC Open Data Portal. By simplifying interactions with open data APIs, the package allows users to focus on analysis and insight rather than data retrieval.
This work extends beyond a single package, contributing to a broader ecosystem of tools designed to make open data more accessible across cities and contexts.
An Integrated Ecosystem
The NYC Open Data Lab connects multiple components into a unified system:
- NYC Open Data Stories — a public-facing collection of civic data analyses
- Student research and publications — developed through coursework and published openly
- Open-source tools — supporting reproducible access to public data
- Teaching and OER — designed around end-to-end reproducible workflows
- Public engagement — through conferences, writing, and civic data events
Each piece reinforces the others, creating a model where teaching, research, and public scholarship are fully integrated.
Vision
The NYC Open Data Lab is part of a broader effort to rethink how data work is taught, shared, and sustained.
The goal is to build a model where:
- data projects are not temporary
- student work contributes to public knowledge
- tools lower barriers to entry
- research is open, reproducible, and accessible
By connecting education with real-world data and public output, the Lab aims to create a sustainable and scalable approach to civic data science.
Collaborating
Interested in collaborating, teaching, or speaking?