Progressive Data Resources
I have amassed many resources as I journeyed through my progressive data career. My studies have led me to dabble in everything from SQL, data engineering, and warehousing to writing and much more. Now I lead data teams and support others in their analytics and engineering journeys. These are my favorite resources. I hope you find something useful here.
This is a living document and will be updated as I continue to find exciting talks, guides, tutorials, and other tidbits that can help you, too, be a progressive data champion.
This page is the companion piece to my blog "How to Land a Job in Progressive Data."
🔧 AI, LLMs, and the wild world we live in
I believe that AI will eventually change every aspect of the way we live, including and especially our work in political data. Whether you are skeptical about these technologies or are an early adopter, having a "no-hype" understanding of how these tools work, where they can go wrong, and how to leverage them safely in your work will be a key skill for data and tech practitioners over the next few years.
Below I outline some of my favorite resources on AI, and will continue to add as I continue my studies.
The Holy Grail Resource 🏆
The above-linked document is the best central resource for 1) understanding the math behind transformers and LLMs, 2) becoming adept at working with these technologies, 3) understanding their limitations and where this technology can go wrong, and 4) what engineers, governments, and policy officials can do to ensure AI alignment and safety.
Runner up: Ben Resnik's AI Reading List
Introduction
Visualizing the deep learning revolution
To understand the velocity of this change.
Compute Trends Across Three Eras of Machine Learning
An increase in compute is why we have had this boom in AI.
Intro to Large Language Models
An overall fantastic lecture.
I love this interactive explainer that communicates the sheer velocity at which AI has developed over the last few years and why we should all be a little concerned.
Technical Deep Dive
Essence of Linear Algebra, by 3Blue1Brown
Transformers are mostly linear algebra, and there is no one better to learn from than 3Blue1Brown.
This is the white paper that started it all: the invention of transformer architecture.
This will get your hands dirty with the code to work with transformers.
What is a neural net, by 3Blue1Brown
You should watch the entirety of this multi-part series.
AI Safety
I am currently working through this course and find it a godsend. If you want an overview of how the technology in LLMs works, an introduction to AI safety, and some tools to be able to work on AI safety problems, I would highly recommend starting with this course.
These four claims explain why general artificial intelligence, which they believe will happen sooner rather than later, must be guided into being used for good.
I think this is the most well-done primer on AI safety that I have seen.
🔧 Technical Skills
SQL
A note on learning SQL: SQL is easy to pick up but deceptively complex. You can pick up the basic syntax of SQL in an afternoon by reading one of the SQL style guides below and working through the Mode SQL tutorial. As a hiring manager, I look for the marks of someone who has progressed beyond the basics to intermediate and advanced skill: they can wrangle messy data in SQL, write clean, legible CTEs, and have an intuition for modeling complex data. You can only gain better SQL skills by writing more SQL and encountering harder and harder challenges, so I recommend opening a free BigQuery instance and playing around.
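To make "messy data plus legible CTEs" concrete, here is a minimal sketch that runs a CTE-based query from Python's built-in sqlite3 module, so it works without a data warehouse. The raw_contacts table, its columns, and the sample rows are made up for illustration; the same query shape should carry over to BigQuery or Redshift with minor dialect changes.

```python
import sqlite3

# Hypothetical messy contact data; table and column names are invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_contacts (id INTEGER, email TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO raw_contacts VALUES (?, ?, ?)",
    [
        (1, " Ana@Example.org ", "ny"),
        (2, "ana@example.org", "NY"),
        (3, "bo@example.org", " ca"),
        (4, None, "CA"),
    ],
)

query = """
with cleaned as (
    -- Standardize casing and whitespace, and drop rows with no email.
    select
        id,
        lower(trim(email)) as email,
        upper(trim(state)) as state
    from raw_contacts
    where email is not null
),

deduped as (
    -- Collapse duplicates to one row per email and state.
    select
        email,
        state,
        min(id) as first_id
    from cleaned
    group by email, state
)

select
    state,
    count(*) as contacts
from deduped
group by state
order by state
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('CA', 1), ('NY', 1)]
```

Each CTE does one job and has a name that says what it does, which is most of what makes a long query readable.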
FAQ: How do I learn SQL if I don't have access to a data warehouse?
I would recommend going through all of the Mode SQL tutorials. Then, BigQuery has a hefty free tier. A good beginner project would be to upload some data to BigQuery, play around with it in SQL, and make a dashboard in Looker. You can show this project in interviews.
Mazur’s Style Guide (the one that most people use)
https://github.com/mattm/sql-style-guide
Brooklyn Data Co Style Guide (my preferred style guide - it's so opinionated and good)
https://github.com/brooklyn-data/co/blob/main/sql_style_guide.md
Mode SQL Tutorials
If you are just starting, begin here. It is the best beginner SQL tutorial on the Internet.
https://mode.com/sql-tutorial/
Mystery SQL Tutorial
https://mystery.knightlab.com/walkthrough.html
Aaron’s SQL Tutorials
https://github.com/ABZ-Aaron/SQL-Tutorials
Learn CTEs
https://learnsql.com/blog/cte-with-examples/
https://www.sqlservertutorial.net/sql-server-basics/sql-server-cte/
https://learnsql.com/blog/sql-subquery-cte-difference/
Terminal/Command Line
I expect everyone on my team to be comfortable with the command line. Not only is it a prerequisite for using git and dbt, but it will make you a better data analyst. Your more senior comrades and coworkers will expect you to be able to navigate your computer from a terminal and to be able to perform basic operations, such as making a directory or opening a file. If you think the terminal is something out of a 1990s hacker movie, take a moment to work through these exercises.
Learning the Shell
https://linuxcommand.org/lc3_learning_the_shell.php
The Unix shell
https://swcarpentry.github.io/shell-novice/
Learn Enough Command Line to Be Dangerous
https://www.learnenough.com/command-line-tutorial/basics
Introduction to the Bash Command Line
https://programminghistorian.org/en/lessons/intro-to-bash
Launch School Command Line Book
https://launchschool.com/books/command_line/read/introduction
Terminal Cheat Sheet
https://gist.github.com/cferdinandi/ef665330286fd5d7127d
Viking Code School Command Line Crash Course
https://www.vikingcodeschool.com/web-development-basics/a-command-line-crash-course
Crash course
https://cglab.ca/~morin/teaching/1405/clcc/book/cli-crash-course.html
dbt
It is no secret that I am a dbt fan. I started using dbt when I was the Data Director of Sunrise and never went back. I refuse to work without it. I think more organizations and campaigns on the left should use this tool. The analytical modeling techniques that you learn while using dbt are worthwhile regardless of whether you actually implement dbt on your team.
What is dbt
https://www.getdbt.com/blog/what-exactly-is-dbt/
How to learn dbt > start here, dbt Fundamentals
https://courses.getdbt.com/courses/fundamentals
From worst to first, revamping your dbt project to be world-class
https://www.youtube.com/watch?v=v26R0EgNK44&ab_channel=dbt
How we structure our dbt Projects
https://docs.getdbt.com/guides/best-practices/how-we-structure/1-guide-overview
Don’t nest your curlies
https://docs.getdbt.com/docs/building-a-dbt-project/dont-nest-your-curlies
Jinja docs
https://jinja.palletsprojects.com/en/2.11.x/
Python
Every data analyst should know SQL and one scripting language. Python is the best language to learn for its ease and versatility. With Python, you can move data between tools and databases, automate tedious tasks, and interact with APIs. See this lovely article by the GOAT, Michael Kaminsky.
RealPython
Use your professional development budget to buy a license. They have great Python tutorials.
LearnPython
Learn Python 3, Codecademy
https://www.codecademy.com/learn/learn-python-3
Scientific Computing with Python
https://www.freecodecamp.org/learn/scientific-computing-with-python/
What does Pythonic code look like?
https://docs.python-guide.org/writing/style/
Object Oriented Programming Tutorial
https://realpython.com/learning-paths/object-oriented-programming-oop-python/
Time complexity
https://wiki.python.org/moin/TimeComplexity?
Codewars
https://www.codewars.com/collections/python-test-1
Think Like a Computer Scientist
https://openbookproject.net/thinkcs/python/english3e/
Learn Python the Hard Way
https://learnpythonthehardway.org/book/
Data Engineering
I adamantly believe that most organizers over-index on data analysis and under-index on data engineering. Data engineering allows you to move data through your tech stack and into the hands of organizers. It's what allows you to bypass the limitations of your CRM to deliver quality, accurate data to your key stakeholders. Data engineering is one of the most in-demand skills in progressive politics. We don't need more dashboards; we need more robust, well-tested pipelines.
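As a small illustration of what "well-tested" can mean in practice, here is a sketch of one pipeline step with a test beside it. The function names, the record shape, and the phone-cleaning rules are assumptions made up for this example and are not taken from any particular CRM or library.

```python
import re


def normalize_phone(raw):
    """Return a 10-digit US phone number, or None if the input can't be cleaned."""
    if raw is None:
        return None
    digits = re.sub(r"\D", "", raw)
    # Drop a leading US country code.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits if len(digits) == 10 else None


def transform(records):
    """Clean a batch of rows and set aside the ones that fail validation."""
    clean, rejected = [], []
    for record in records:
        phone = normalize_phone(record.get("phone"))
        if phone is None:
            rejected.append(record)
        else:
            clean.append({**record, "phone": phone})
    return clean, rejected


def test_transform():
    clean, rejected = transform([
        {"id": 1, "phone": "(212) 555-0100"},
        {"id": 2, "phone": "+1 212 555 0101"},
        {"id": 3, "phone": "555-0102"},
        {"id": 4, "phone": None},
    ])
    assert [r["phone"] for r in clean] == ["2125550100", "2125550101"]
    assert [r["id"] for r in rejected] == [3, 4]


test_transform()
```

Keeping the transform a plain function, separate from the code that reads and writes data, is what makes it easy to test; a runner such as pytest would pick up test_transform automatically.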
Austin Weisgrau's data engineering resources for the left
https://austinweisgrau.github.io/data-engineering-resources.html
Data Engineering Zoomcamp
I am still working through this free data engineering boot camp, but I have heard good things about it.
https://github.com/DataTalksClub/data-engineering-zoomcamp
Parsons (open source engineering library for lefty/progressive tools)
https://github.com/move-coop/parsons
Intro to REST APIs
https://www.restapitutorial.com/
Getting Started With Testing in Python
https://realpython.com/python-testing/
Warehouses
Civis is not a data warehouse. Civis is a wrapper around a Redshift data warehouse that facilitates ease of use. Good engineers know about their warehouses, how data are stored, and how to extract better performance from them. If you use Redshift, you should read the Redshift documentation front to back. The same advice goes for BigQuery.
Michael Kaminsky's series on data warehouses
https://www.fivetran.com/people/michael-kaminsky
Michael is my favorite expert on data warehouses, and I find their teaching style to be really intuitive and easy to grok. Michael wrote several articles for Fivetran on data warehouses that I highly recommend.
Modeling data
This section is a blend of my SQL and dbt sections. Analytics engineers are expected to have strong data modeling skills - that is, the ability to take raw data and transform it into business-facing logic. For example, the raw tables loaded into your warehouse have little meaning to your Political Director, but with some data modeling, you can turn those raw tables into models that have valuable insight (and may be used for a report or dashboard). Below are a few of my favorite reads on data modeling.
What is an OLAP cube?
https://analyticsengineers.club/whats-an-olap-cube/
Dimensional modeling
https://docs.getdbt.com/terms/dimensional-modeling
Building a Kimball dimensional model with dbt
https://docs.getdbt.com/blog/kimball-dimensional-model
Kimball in the context of the modern data warehouse: what's worth keeping, and what's not
Slowly changing dimension
This is a fun data concept that I think every data analyst should know!
https://en.wikipedia.org/wiki/Slowly_changing_dimension#Type_2:_add_new_row
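For a feel of how a Type 2 slowly changing dimension behaves, here is a minimal in-memory sketch. The row layout (key, attrs, valid_from, valid_to) and the apply_scd2 helper are hypothetical names chosen for illustration; in a warehouse you would express the same idea in SQL or with a tool such as dbt snapshots.

```python
from datetime import date


def apply_scd2(history, key, new_attrs, effective):
    """Apply a Type 2 change: close the current row for `key` and add a new one.

    `history` is a list of dicts with keys: key, attrs, valid_from, valid_to.
    A row whose valid_to is None is the current version.
    """
    current = next(
        (row for row in history if row["key"] == key and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return history  # nothing changed, so no new row
        current["valid_to"] = effective  # close out the old version
    history.append(
        {"key": key, "attrs": new_attrs, "valid_from": effective, "valid_to": None}
    )
    return history


history = []
apply_scd2(history, "voter-1", {"county": "Kings"}, date(2023, 1, 1))
apply_scd2(history, "voter-1", {"county": "Kings"}, date(2023, 6, 1))  # no change
apply_scd2(history, "voter-1", {"county": "Queens"}, date(2024, 3, 15))

for row in history:
    print(row)
```

After these three calls the history holds two rows: the Kings version, closed on the date of the move, and the current Queens version. Nothing is overwritten, so you can still answer "where did this person live last summer?"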
Git/GitHub
Version control is a must for advanced data teams. As a data professional, you may be expected to collaborate with your team via git and GitHub. If you don't have one already, I highly suggest making a GitHub account and showcasing some of your work in a portfolio repo. Here are some resources to help you master git.
git for the rest of us
https://www.youtube.com/watch?v=mGSecJDvtUQ&ab_channel=dbt
Claire Carroll is the GOAT and is the single best teacher of git that I've ever come across.
YouTube GitHub Walkthrough
https://www.youtube.com/watch?v=iv8rSLsi1xo
Git Immersion
https://gitimmersion.com/index.html
Git guide
https://github.com/dbt-labs/corp/blob/main/git-guide.md
How I build a Feature
https://simonwillison.net/2022/Jan/12/how-i-build-a-feature/
How to create a Pull Request
Git Handbook
https://guides.github.com/introduction/git-handbook/
Introduction to git flow
https://guides.github.com/introduction/flow/
GitHub Hello World Tutorial
https://guides.github.com/activities/hello-world/
Documentation
Documenting Python Code - A Complete Guide
https://realpython.com/documenting-python-code/
How to Document Python Code
https://www.datacamp.com/tutorial/documenting-python-code
🔧 "Soft" Skills
Writing
My best advice on how to strengthen your technical writing skills is to write. The only way to become a better coder is to code, and the only way to become a better writer is to write. You will write a bunch of crap until you don't anymore. My blog started off as a way for me to overcome my writing anxiety--I forced myself to write by doing it in public, even if my writing wasn't very good. I encourage you to consider sharing what you know, no matter how small.
How Not to be a terrible writer video
Writing Handbook
https://www.julian.com/guide/write/intro
Writing about your work
https://docs.google.com/document/d/1xSzr6o34v2gpHyubtkF2L09L03wTZRb4ICB2GMgsp3c/edit?usp=sharing
Leadership
The Phoenix Project, book
A book about IT leadership, but it applies just as well to data and tech leaders.
An Intake Form for Data Requests
An Intake Form for Data Requests – Haystacks
How to Structure a Data Team
How to structure a data team - YouTube
AI Hierarchy of Needs
https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007
An Elegant Puzzle: Systems of Engineering Management
https://www.amazon.com/Elegant-Puzzle-Systems-Engineering-Management/dp/1732265186
The Manager's Path: A Guide for Tech Leaders Navigating Growth and Change
https://www.amazon.com/Managers-Path-Leaders-Navigating-Growth/dp/1491973897
One Analyst’s Guide for Going from Good to Great
https://www.getdbt.com/blog/one-analysts-guide-for-going-from-good-to-great/
🔧 Progressive Data
General
Inside the Cave: a look at the Obama tech team
https://enga.ge/wp-content/uploads/2018/01/Inside_the_Cave-1.pdf
Bernie 2020 Data Team post mortem
https://docs.google.com/document/d/1WBvC2oxvMUTV65xW6cPMAcyX_qkpXaPK0dC0Q3wI4pE/edit
Resource on asking demographic questions
http://nikkistevens.com/open-demographics/
Job Boards
https://progressivedatajobs.org/
Loosely defined as a job where the majority of the role, department or organization/company is focused on data or analytics work in the progressive space. This may mean things such as managing databases, client or partner relationships, management, statistical analysis or certain tech roles, such as software engineers.
https://jobs.codeforamerica.org/
This job board is a collection of curated opportunities in public interest technology. We believe that people with design, product, analytical, and technical expertise can make a difference in how government works.
Movement Builders is a free service from People’s Action for sharing jobs, contract work, and internships (paid only, please) from organizations in our network and allied organizations.
https://careercenter.democraticgain.org//
Democratic Gain career center. Search a wide variety of jobs in the political and non-profit space across the country.
A community for people of color in politics and activism. Joining gives you access to job postings and trainings; personalized mentorship and resume review are also available.
https://jobs.codeforamerica.org/
Hosted by Code for Progress, you can find tech jobs in government, non-profits that partner with governments and for-profits that have a strong focus on social impact.
Upload your resume and organizations and campaigns will reach out to you! This is a great talent recruiting site to help you get noticed by recruiters and hiring managers.
Guides & Resources
https://www.guide.progressivedatajobs.org/
This is the OG guide for folk looking to start their career in progressive analytics. If you want a good primer of what it takes to land a job in this field, look no further than this guide.
https://www.crackthecode.io/resources/
A great compilation of progressive data resources.
http://hellomartha.co/resources.html
A wonderful compilation of resources maintained by a talented engineer in the space.
Trainings
Analytics Engineers Club
https://analyticsengineers.club/
This is the best data training on the market. I have sent everyone I've managed through this course. I swear by it. And, if you message me, I may be able to provide you with a discount code (:
Arena Academy
Trains people to join electoral campaigns. Has a Data Director track.
rePower
Data and Analytics boot camp for campaigns and progressive tech. This is a great place to start if you're looking to get started in the field.
Generation Data
Generationdata.org
Another great institution that offers data and tech trainings.
Listservs
Progressphiles
Progressphiles, the largest online progressive data community. You need two recommenders to join, and I can be one of them.
Data Ladies Alliance!
This is a listserv for women and non-conforming folks in progressive data and technology. We are committed to creating a safe space for women and non-binary individuals to discuss working with progressive and political data and technology. To nominate someone to join the list, ask Brittany.
JobsthatareLEFT
https://groups.google.com/forum/#!forum/jobsthatareleft
Volunteer Opportunities
Ragtag
Ragtag is building a movement of technologists to amplify progressive organizing in innovative and high-impact ways.
DataKind
Harnessing the power of data science in the service of humanity.
Code for America
https://www.codeforamerica.org/
Check your local Meetup
Prometheus
https://www.joinprometheus.com/for-talent
Misc
Progressive Data, Analytics, and Technology Salary Survey
https://www.progressivedatajobs.org/salary-survey/
This is an annual survey conducted among those currently working in Data, Analytics, Technology in Progressive & Democratic work spaces. The survey uses a broad definition of the industry and the fields and has included responses from people working in candidate campaigns and independent expenditures, issue advocacy, nonprofits, labor unions and more.
Data for Democracy
https://www.datafordemocracy.org/
Data for Democracy started as an experiment in December 2016, when people from around the globe began working together on data-related issues through Slack messages and GitHub commits. Without any rules or formal organizational structure, the focus was on completing genuine, effective work without significant delay.
Flowing Data
FlowingData explores how we use analysis and visualization to understand data and ourselves.
The blog — a combination of highlighting others’ work, my own projects, and visualization guides — is a free resource for everyone. It’s completely supported by members, who get access to courses, tutorials, and The Process.
🔧 Books
While progressive data professionals need strong technical skills, they also need strong critical thinking skills. One of the ways to sharpen your intellect is by reading, and by reading a wide range of material. I present a handful of my favorite books. I have read or thoroughly skimmed every recommendation below.
Books on data ethics
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
https://www.goodreads.com/book/show/28186015-weapons-of-math-destruction
Algorithms of Oppression
https://nyupress.org/9781479837243/algorithms-of-oppression/
Automating Inequality
https://us.macmillan.com/books/9781250074317
Data Feminism (Strong Ideas)
https://www.amazon.com/gp/product/0262044005/ref=ppx_yo_dt_b_asin_image_o05_s02?ie=UTF8&psc=1
Books on data and race
Race After Technology: Abolitionist Tools for the New Jim Code
Constructing Race and Ethnicity in America
https://www.amazon.com/gp/product/0765608014/ref=ppx_yo_dt_b_asin_image_o05_s03?ie=UTF8&psc=1
What Is "Your" Race?: The Census and Our Flawed Efforts to Classify Americans
https://www.amazon.com/gp/product/0691157030/ref=ppx_yo_dt_b_asin_title_o06_s00?ie=UTF8&psc=1
Captivating Technology: Race, Carceral Technoscience, and Liberatory Imagination in Everyday Life
https://www.amazon.com/gp/product/1478003812/ref=ppx_yo_dt_b_asin_title_o07_s00?ie=UTF8&psc=1
Sorting Things Out: Classification and Its Consequences (Inside Technology)
https://www.amazon.com/gp/product/0262522950/ref=ppx_yo_dt_b_asin_title_o05_s00?ie=UTF8&p
Counting Americans
https://www.amazon.com/gp/product/0190092475/ref=ppx_yo_dt_b_asin_title_o05_s01?ie=UTF8&psc=1
Books on data and politics
Prototype Politics: Technology-Intensive Campaigning and the Data of Democracy (Oxford Studies in Digital Politics)
https://www.amazon.com/gp/product/0199350256/ref=ppx_yo_dt_b_asin_title_o04_s00?ie=UTF8&psc=1
The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power
https://www.amazon.com/gp/product/1610395697/ref=ppx_yo_dt_b_asin_title_o05_s00?ie=UTF8&psc=1