SQL introductions
- SQL for data analysis is an introduction to SQL for Python data analysts. It compares SQL's capabilities with Pandas' and shows how and when to use which of these two technologies.
- The book “SQL for data scientists” comes with an online editor for testing your SQL queries.
Less serious but more fun approaches to learning SQL are these three gamified resources:
- Interactive beginner-friendly tutorial
- Text adventure to learn SQL
- Game to learn SQL
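To make the SQL-versus-Pandas comparison from the first resource more tangible, here is a minimal sketch that computes the same per-customer total twice: once with a SQL `GROUP BY` and once with a Pandas `groupby`. The `orders` table, its columns and its values are made up for this illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, invented for this illustration.
orders = pd.DataFrame(
    {
        "customer": ["ada", "bob", "ada", "cy", "bob"],
        "amount": [10.0, 5.0, 20.0, 7.5, 2.5],
    }
)

# SQL route: copy the rows into an in-memory SQLite database and aggregate there.
conn = sqlite3.connect(":memory:")
orders.to_sql("orders", conn, index=False)
sql_totals = pd.read_sql_query(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY customer",
    conn,
)
conn.close()

# Pandas route: the same aggregation expressed as DataFrame operations.
pandas_totals = (
    orders.groupby("customer", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total"})
    .sort_values("customer")
    .reset_index(drop=True)
)

print(sql_totals)
print(pandas_totals)
```

Both routes yield one row per customer. Which one reads better is largely a matter of taste and of where the data already lives.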
SQL helpers
JupySQL brings SQL to Jupyter notebooks. It includes some plotting functionality and Pandas integration, so you can write SQL to load your data into Pandas DataFrames. It also supports DuckDB.
DuckDB in JupyterLab. DuckDB itself is a single-user, single-node database for data engineers who would rather solve their analysis and data wrangling problems with SQL than write Pandas or Polars code. It is very fast and quite flexible.
Speaking of DuckDB and how it compares with Pandas, Polars and the like, there is a benchmark of database-like ops from 2021.
Julia Evans' awesome list of playgrounds also includes some SQL playgrounds, one of which is her own.
Hints and tips for SQL
- PostgreSQL:
# \d my_table
for describing the schema of a table
- Postgres performance analysis
Python specific
SQLtap is a profiler for SQLAlchemy that shows where your real bottlenecks are. SQLModel is a library on top of SQLAlchemy and Pydantic that abstracts the SQL away from your Python code.
Further reading
Also have a look at the article about dataframes.
If you want to look at your data inside your terminal, you might be interested in the ?q tools such as jq, and in nushell (which I haven't tried yet).