💻 https://github.com/zachvalenta/query-sandbox

VISIDATA

  • grabbing subset of records
  • figuring out what columns matter

CLI

  • dbcli: snippets
  • duckdb: agg

TERMINAL

  • Polars
  • uniplot for viz

TOOLING

GUI

  • DB Browser

TUI

I use VisiData a lot.

DataGrip Harlequin looked promising but the

notebook

I don't default to notebooks. They feel slow. I don't use sophisticated enough charting to feel the pain of not using them. They encourage bad code. I hear they're not reproducible.

Jupyter is entrenched [Quarto, Great Tables]; Marimo looks promising.

I use the Jupyter extension for VS Code + Data Wrangler.

🔢 DATA

SQL

IN BROWSER SQL

  • https://github.com/Spyyy004/SQLPremierLeague-Frontend
  • https://selectstarsql.com
  • https://pgexercises.com/gettingstarted.html
  • https://sql-playground.wizardzines.com/
  • https://www.crunchydata.com/developers/tutorials
  • https://sqlime.org/

tools

DATABASES

  • DuckDB
  • SQLite
  • Postgres
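
A minimal sketch of the kind of aggregation query these engines get used for, run against SQLite through Python's built-in `sqlite3` module so it needs no install. The `tracks` table, its columns, the sample rows, and the `streams_by_artist` helper are all made up for illustration; nothing here comes from the sandbox repo.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for this example.
conn.execute("CREATE TABLE tracks (artist TEXT, streams INTEGER)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?)",
    [("a", 10), ("a", 5), ("b", 7)],
)


def streams_by_artist(conn):
    """Return (artist, total_streams) rows, largest total first."""
    query = """
        SELECT artist, SUM(streams) AS total
        FROM tracks
        GROUP BY artist
        ORDER BY total DESC
    """
    return conn.execute(query).fetchall()


rows = streams_by_artist(conn)
for artist, total in rows:
    print(artist, total)
```

The same `GROUP BY` query should carry over to DuckDB or Postgres largely unchanged; only the connection setup differs.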

CLI

  • IPython
  • litecli
  • pgcli
  • sqlite-utils

TUI

  • VisiData
  • Datasette

DATAFRAMES

  • Polars
  • Pandas

sets

🗄️ analytics.md canonical

  • Spotify 2024 https://www.kaggle.com/datasets/nelgiriyewithana/most-streamed-spotify-songs-2024 https://huggingface.co/datasets/vishnupriyavr/spotify-million-song-dataset
  • Great Tables https://posit-dev.github.io/great-tables/reference/#built-in-datasets
  • housing https://www.zillow.com/research/data/
  • Sakila https://github.com/jOOQ/sakila