This page leverages some basic transform syntax from later parts of the walkthrough. Don’t worry too much about it for now: the core purpose of this page is to understand how relations work in Vinyl.

Properties

As in SQL, relations are the core data model. But relations in Vinyl have some unique properties:

Relations are column aware

Take the seattle_weather dataset from vinyl’s examples.

from vinyl.examples import seattle_weather

weather = seattle_weather()

weather is a Vinytable object and carries far more information that a standard sql table or cte. The schema, for example, can be pulled easily without running a query against the database.

print(weather.schema())
┏━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓ ┃ # column type ┃ ┡━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩ │ 0 │ date │ timestamp(6) │ │ 1 │ precipitation │ float64 │ │ 2 │ temp_max │ float64 │ │ 3 │ temp_min │ float64 │ │ 4 │ wind │ float64 │ │ 5 │ weather │ string │ └───┴───────────────┴──────────────┘

Columns are attrributes of the class, so the code below returns the wind column.

weather.wind
r0 := DatabaseTable: seattle_weather
  date          timestamp(6)
  precipitation float64
  temp_max      float64
  temp_min      float64
  wind          float64
  weather       string

wind: r0.wind

And, after a basic transform:

wind_doubled = weather.select(weather.wind * 2)
wind_doubled
r0 := DatabaseTable: seattle_weather
  date          timestamp(6)
  precipitation float64
  temp_max      float64
  temp_min      float64
  wind          float64
  weather       string

Project[r0]
  Multiply(wind, 2): r0.wind * 2

This means you can access the information at any point in your queries. For example

temps = weather.select([col for col in weather.columns if col.startswith("temp")])
print(temps.schema())
┏━━━┳━━━━━━━━━━┳━━━━━━━━━┓ ┃ # column type ┃ ┡━━━╇━━━━━━━━━━╇━━━━━━━━━┩ │ 0 │ temp_max │ float64 │ │ 1 │ temp_min │ float64 │ └───┴──────────┴─────────┘

will select the temperature columns from weather.

Relations are lazy

Vinyl keeps track of its syntax lazily, and only compiles when a variable is executed or its sql is generated.

By default, vinyl will return a string representation of the ast associated with a table unless specified otherwise. For example:

temps
r0 := DatabaseTable: seattle_weather
  date          timestamp(6)
  precipitation float64
  temp_max      float64
  temp_min      float64
  wind          float64
  weather       string

Project[r0]
  temp_max: r0.temp_max
  temp_min: r0.temp_min

If you’d like to see a graphical representation, use .visualize():

temps.visualize()
╭──────────────────────╮ │ Project │ ╭──────────────────────▶│ temp_max: float64 │ │ │ temp_min: float64 │ │ ╰──────────────────────╯ │ ▲▲ │ ││ │ ││ │ ││ ╭─────────────────╮ ╭─────────────────╮ │ temp_max: Field │ │ temp_min: Field │ │ :: float64 │ │ :: float64 │ ╰─────────────────╯ ╰─────────────────╯ ▲ ▲│ │ ││ │ ││ │ ││ │ ││ │ ╭────────────────────────────────╮ │ │ seattle_weather: DatabaseTable │ │ │ date: timestamp(6) │ │ │ precipitation: float64 │ ╰──────────────────│ temp_max: float64 │ │ temp_min: float64 │ │ wind: float64 │ │ weather: string │ ╰────────────────────────────────╯

To execute this, you only need to run:

temps.execute()
temp_maxtemp_min
012.85.0
110.62.8
211.77.2
312.25.6
48.92.8
14564.41.7
14575.01.7
14587.20.6
14595.6-1.0
14605.6-2.1

1461 rows × 2 columns

By default, this returns a pandas DataFrame. You can also return a text or pyarrow representation by specifying that as in the examples below:

temps.execute("text")
┏━━━━━━━━━━┳━━━━━━━━━━┓ ┃ temp_max temp_min ┃ ┡━━━━━━━━━━╇━━━━━━━━━━┩ │ float64float64 │ ├──────────┼──────────┤ │ 12.85.0 │ │ 10.62.8 │ │ 11.77.2 │ │ 12.25.6 │ │ 8.92.8 │ │ 4.42.2 │ │ 7.22.8 │ │ 10.02.8 │ │ 9.45.0 │ │ 6.10.6 │ │ │ └──────────┴──────────┘

temps.execute("pyarrow")
pyarrow.Table
temp_max: double
temp_min: double
----
temp_max: [[12.8,10.6,11.7,12.2,8.9,...,4.4,5,7.2,5.6,5.6]]
temp_min: [[5,2.8,7.2,5.6,2.8,...,1.7,1.7,0.6,-1,-2.1]]

You can also execute it and save the results to various formats (csv, json, etc.) using .save().

Relations are selectively mutable

By default, VinylTables are immutable. For example, the original table ast is printed when run is the same as the original weather.

weather.select(weather.wind * 2)
weather
DatabaseTable: seattle_weather
  date          timestamp(6)
  precipitation float64
  temp_max      float64
  temp_min      float64
  wind          float64
  weather       string

That said, there are two key cases where VinylTables are considered mutable. This allows for a more ergonomic syntax, especially when you are chaining several transforms together.

The two cases are: 1. Within specially decorated functions (i.e. those with @model or @metric decorator) 2. Context managers

The first case is designed to support :

from vinyl import T, model


@model(deps=[seattle_weather])
def wind(w: T) -> T:
    w.select(w.wind * 2)
    return w


wind()
r0 := DatabaseTable: seattle_weather
  date          timestamp(6)
  precipitation float64
  temp_max      float64
  temp_min      float64
  wind          float64
  weather       string

Project[r0]
  Multiply(wind, 2): r0.wind * 2

The second use case is designed primarily for analysis use cases (i.e. in a notebook):

with seattle_weather() as w:
    w.select(w.wind * 2)

w
r0 := DatabaseTable: seattle_weather
  date          timestamp(6)
  precipitation float64
  temp_max      float64
  temp_min      float64
  wind          float64
  weather       string

Project[r0]
  Multiply(wind, 2): r0.wind * 2

This approach can also be used inside vinyl pipeline functions to create a sort of “cte”.

True CTEs don’t exist in vinyl. In immutable mode, you simply need to save a subcalculation to a variable name to achieve the same functionality. But in mutable mode, as in a pipeline function, saving subcalculations as variables can cause issues, because it may have off-target effects on the sources of the subcalculation.

This is where the context manager comes in handy. It allows you to create a temporary variable that is only available within the context of the context manager.

In the example below, we get the average wind by month and then use it to calculate the wind speed deviation from the average.

This can be done more efficiently in vinyl using a window function, but this is a good example of how to use the context manager.

from vinyl import T, model, join


@model(deps=[seattle_weather])
def wind_vs_avg(w: T) -> T:
    with w as mw:
        mw.aggregate(
            cols = {"avg_wind": mw.wind.mean()}, 
            by = {"month": mw.date.dt.floor(months=1)}
        )

    
    j = join(w, mw, on = [w.date.dt.floor(months=1) == mw.month])
    j.select([w.date, w.wind - mw.avg_wind])

    return j

wind_vs_avg().execute("text")
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┃ date Subtract(wind, avg_wind) ┃ ┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │ timestamp(6)float64 │ ├─────────────────────┼──────────────────────────┤ │ 2012-01-01 00:00:000.8 │ │ 2012-01-02 00:00:000.6 │ │ 2012-01-03 00:00:00-1.6 │ │ 2012-01-04 00:00:000.8 │ │ 2012-01-05 00:00:002.2 │ │ 2012-01-06 00:00:00-1.7 │ │ 2012-01-07 00:00:00-1.6 │ │ 2012-01-08 00:00:00-1.9 │ │ 2012-01-09 00:00:00-0.5 │ │ 2012-01-10 00:00:00-0.5 │ │ │ └─────────────────────┴──────────────────────────┘

Relations are dialect-independent

Vinyl uses the Ibis library to generate SQL. This means that you can write your queries in a dialect agnostic way. For example, the following code works in across the dialects currently supported by Vinyl:

  • BigQuery
  • Snowflake
  • DuckDB
  • Postgres

Ibis itself supports almost 20 dialects, so we plan to add more over time.

As an example, let’s create a table that lists the max temperature of each day by month:

with seattle_weather() as w:
    w.aggregate(
        cols = {"temp_max": w.temp_max.collect()}, 
        sort = {"month": w.date.dt.floor(months=1)}
    )

temp_by_month = w
temp_by_month.execute("text")
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓ ┃ month temp_max ┃ ┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩ │ timestamparray<float64> │ ├─────────────────────┼────────────────┤ │ │ [ │ │ │ 12.8, │ │ │ 10.6, │ │ │ 11.7, │ │ │ 12.2, │ │ │ 8.9, │ │ │ 4.4, │ │ │ 7.2, │ │ │ 10.0, │ │ │ 9.4, │ │ │ 6.1, │ │ │ 6.1, │ │ │ 6.1, │ │ │ 5.0, │ │ │ 4.4, │ │ │ 1.1, │ │ 2012-01-01 00:00:001.7, │ │ │ 3.3, │ │ │ 0.0, │ │ │ -1.1, │ │ │ 7.2, │ │ │ 8.3, │ │ │ 6.7, │ │ │ 8.3, │ │ │ 10.0, │ │ │ 8.9, │ │ │ 8.9, │ │ │ 6.7, │ │ │ 6.7, │ │ │ 9.4, │ │ │ 8.3, │ │ │ 9.4 │ │ │ ] │ │ │ [ │ │ │ 8.9, │ │ │ 8.3, │ │ │ 14.4, │ │ │ 15.6, │ │ │ 13.9, │ │ │ 16.1, │ │ │ 15.6, │ │ │ 10.0, │ │ │ 11.1, │ │ │ 12.8, │ │ │ 8.9, │ │ │ 8.3, │ │ │ 7.2, │ │ │ 6.7, │ │ 2012-02-01 00:00:007.2, │ │ │ 7.2, │ │ │ 10.0, │ │ │ 6.7, │ │ │ 6.7, │ │ │ 7.8, │ │ │ 10.0, │ │ │ 10.0, │ │ │ 8.3, │ │ │ 6.7, │ │ │ 7.2, │ │ │ 5.0, │ │ │ 6.7, │ │ │ 6.7, │ │ │ 5.0 │ │ │ ] │ │ │ [ │ │ │ 6.1, │ │ │ 6.7, │ │ │ 12.2, │ │ │ 10.6, │ │ │ 7.8, │ │ │ 6.7, │ │ │ 8.9, │ │ │ 15.6, │ │ │ 9.4, │ │ │ 7.2, │ │ │ 6.7, │ │ │ 8.3, │ │ │ 5.6, │ │ │ 7.8, │ │ │ 11.1, │ │ 2012-03-01 00:00:008.9, │ │ │ 10.0, │ │ │ 5.0, │ │ │ 7.2, │ │ │ 7.8, │ │ │ 8.9, │ │ │ 10.0, │ │ │ 12.2, │ │ │ 15.0, │ │ │ 13.3, │ │ │ 12.8, │ │ │ 14.4, │ │ │ 10.6, │ │ │ 10.0, │ │ │ 9.4, │ │ │ 10.0 │ │ │ ] │ │ │ [ │ │ │ 8.9, │ │ │ 16.7, │ │ │ 11.7, │ │ │ 10.6, │ │ │ 9.4, │ │ │ 11.1, │ │ │ 16.1, │ │ │ 21.1, │ │ │ 20.0, │ │ │ 17.8, │ │ │ 11.1, │ │ │ 13.9, │ │ │ 15.0, │ │ │ 15.6, │ │ 2012-04-01 00:00:0016.1, │ │ │ 13.3, │ │ │ 10.0, │ │ │ 13.3, │ │ │ 13.9, │ │ │ 13.3, │ │ │ 20.0, │ │ │ 23.3, │ │ │ 21.7, │ │ │ 13.9, │ │ │ 16.7, │ │ │ 13.9, │ │ │ 13.3, │ │ │ 16.1, │ │ │ 15.6, │ │ │ 12.8 │ │ │ ] │ │ │ [ │ │ │ 11.7, │ │ │ 13.3, │ │ │ 11.1, │ │ │ 12.2, │ │ │ 13.3, │ │ │ 17.8, │ │ │ 23.9, │ │ │ 18.3, │ │ │ 13.3, │ │ │ 14.4, │ │ │ 18.3, │ │ │ 24.4, │ │ │ 25.6, │ │ │ 26.7, │ │ │ 24.4, │ │ 2012-05-01 00:00:0019.4, │ │ │ 17.8, │ │ │ 15.6, │ │ │ 19.4, │ │ │ 14.4, │ │ │ 16.7, │ │ │ 12.8, │ │ │ 14.4, │ │ │ 17.2, │ │ │ 22.2, │ │ │ 22.2, │ │ │ 17.2, │ │ │ 16.7, │ │ │ 16.1, │ │ │ 18.9, │ │ │ 17.8 │ │ │ ] │ │ │ [ │ │ │ 20.0, │ │ │ 18.9, │ │ │ 17.2, │ │ │ 12.8, │ │ │ 13.3, │ │ │ 16.1, │ │ │ 16.1, │ │ │ 15.0, │ │ │ 17.2, │ │ │ 18.9, │ │ │ 23.3, │ │ │ 18.3, │ │ │ 16.1, │ │ │ 17.2, │ │ 2012-06-01 00:00:0022.2, │ │ │ 21.1, │ │ │ 18.9, │ │ │ 17.2, │ │ │ 19.4, │ │ │ 24.4, │ │ │ 23.9, │ │ │ 13.9, │ │ │ 15.6, │ │ │ 19.4, │ │ │ 19.4, │ │ │ 18.3, │ │ │ 22.8, │ │ │ 22.2, │ │ │ 21.7, │ │ │ 20.0 │ │ │ ] │ │ │ [ │ │ │ 20.0, │ │ │ 18.9, │ │ │ 18.3, │ │ │ 20.6, │ │ │ 24.4, │ │ │ 25.0, │ │ │ 26.7, │ │ │ 28.3, │ │ │ 25.0, │ │ │ 23.9, │ │ │ 27.8, │ │ │ 25.6, │ │ │ 23.3, │ │ │ 25.0, │ │ │ 18.9, │ │ 2012-07-01 00:00:0026.1, │ │ │ 21.7, │ │ │ 21.1, │ │ │ 25.0, │ │ │ 19.4, │ │ │ 23.9, │ │ │ 20.6, │ │ │ 18.9, │ │ │ 23.3, │ │ │ 26.7, │ │ │ 25.6, │ │ │ 18.9, │ │ │ 22.2, │ │ │ 22.8, │ │ │ 19.4, │ │ │ 22.8 │ │ │ ] │ │ │ [ │ │ │ 23.9, │ │ │ 23.3, │ │ │ 27.2, │ │ │ 33.9, │ │ │ 33.9, │ │ │ 28.3, │ │ │ 21.1, │ │ │ 22.2, │ │ │ 24.4, │ │ │ 25.6, │ │ │ 28.3, │ │ │ 30.6, │ │ │ 30.6, │ │ │ 28.9, │ │ │ 31.1, │ │ 2012-08-01 00:00:0034.4, │ │ │ 32.8, │ │ │ 21.7, │ │ │ 23.3, │ │ │ 25.6, │ │ │ 23.3, │ │ │ 22.2, │ │ │ 21.1, │ │ │ 22.2, │ │ │ 26.1, │ │ │ 21.1, │ │ │ 23.9, │ │ │ 22.8, │ │ │ 22.8, │ │ │ 22.8, │ │ │ 22.2 │ │ │ ] │ │ │ [ │ │ │ 21.7, │ │ │ 21.1, │ │ │ 22.8, │ │ │ 24.4, │ │ │ 26.1, │ │ │ 28.3, │ │ │ 32.2, │ │ │ 25.0, │ │ │ 18.9, │ │ │ 20.0, │ │ │ 20.0, │ │ │ 22.2, │ │ │ 27.8, │ │ │ 26.1, │ │ 2012-09-01 00:00:0022.2, │ │ │ 24.4, │ │ │ 27.8, │ │ │ 27.8, │ │ │ 23.9, │ │ │ 19.4, │ │ │ 16.1, │ │ │ 19.4, │ │ │ 19.4, │ │ │ 21.1, │ │ │ 19.4, │ │ │ 19.4, │ │ │ 22.8, │ │ │ 25.0, │ │ │ 20.6, │ │ │ 21.1 │ │ │ ] │ │ │ [ │ │ │ 23.3, │ │ │ 17.8, │ │ │ 18.9, │ │ │ 18.9, │ │ │ 21.7, │ │ │ 23.9, │ │ │ 23.9, │ │ │ 21.1, │ │ │ 16.1, │ │ │ 12.2, │ │ │ 13.9, │ │ │ 13.9, │ │ │ 15.6, │ │ │ 17.8, │ │ │ 17.2, │ │ 2012-10-01 00:00:0016.1, │ │ │ 14.4, │ │ │ 17.8, │ │ │ 15.0, │ │ │ 11.1, │ │ │ 11.7, │ │ │ 7.8, │ │ │ 11.1, │ │ │ 11.7, │ │ │ 11.7, │ │ │ 11.1, │ │ │ 14.4, │ │ │ 14.4, │ │ │ 15.6, │ │ │ 15.0, │ │ │ 15.6 │ │ │ ] │ │ │ └─────────────────────┴────────────────┘

Here’s how vinyl translates the query to each dialect:

SELECT
  `t1`.`month`,
  `t1`.`temp_max`
FROM (
  SELECT
    TIMESTAMP_TRUNC(`t0`.`date`, MONTH) AS `month`,
    ARRAY_AGG(`t0`.`temp_max` IGNORE NULLS) AS `temp_max`
  FROM `seattle_weather` AS `t0`
  GROUP BY
    1
) AS `t1`
ORDER BY
  `t1`.`month` ASC NULLS LAST

Creating relations

In the context of a project, relations are created and managed automatically using the source generation process.

In the context of a notebook or analysis, you can create relations from a variety of sources, including: - Pandas DataFrames - Polars DataFrames - PyArrow Tables - CSVs - JSONs - Parquet files

For in-memory sources, use VinylTable.from_memory as:

from vinyl import VinylTable
import pandas as pd


d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)

VinylTable.from_memory(df)
InMemoryTable
  data:
    PandasDataFrameProxy:
         col1  col2
      0     1     3
      1     2     4

And for file-based sources, use VinylTable.from_file:

VinylTable.from_file("data/barley.parquet")
DatabaseTable: barley
  yield   float64
  variety string
  year    int64
  site    string

Exploring relations

In the properties section action, you’ve already seen some of the most common ways to visualize relations:

  • print() to view the AST
  • .schema() to view the schema
  • .visualize() to view the query plan
  • .execute() to run the query and view it as text, pyarrow, or pandas objects
  • .to_sql() to view the SQL-equivalent

In addition to these, there are a few other key methods to know: - .eda() provides a quick exploratory data analysis of the relation, excluding its most common values - .chart() provides a grammar of graphics interface to quickly visualize the relation

Let’s view examples of each of these:

weather.eda(topk=2)
┏━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓ ┃ name type nullable nulls non_nulls null_frac values counts ┃ ┡━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩ │ stringstringbooleanint64int64float64array<string>array<int64> │ ├───────────────┼──────────────┼──────────┼───────┼───────────┼───────────┼────────────────────────────────────────────────┼──────────────┤ │ date timestamp(6) │ True │ 014610.0['2012-02-05 00:00:00', '2012-02-14 00:00:00'][1, 1] │ │ precipitationfloat64 │ True │ 014610.0['0.0', '0.3'][838, 54] │ │ temp_max float64 │ True │ 014610.0['11.1', '14.4'][58, 49] │ │ temp_min float64 │ True │ 014610.0['6.1', '10.0'][66, 64] │ │ wind float64 │ True │ 014610.0['2.6', '3.0'][76, 65] │ │ weather string │ True │ 014610.0['sun', 'fog'][714, 411] │ └───────────────┴──────────────┴──────────┴───────┴───────────┴───────────┴────────────────────────────────────────────────┴──────────────┘
r0 := DatabaseTable: seattle_weather
  date          timestamp(6)
  precipitation float64
  temp_max      float64
  temp_min      float64
  wind          float64
  weather       string

r1 := Aggregate[r0]
  groups:
    weather: r0.weather
  metrics:
    Count(weather): Count(r0.weather)

r2 := Aggregate[r0]
  groups:
    wind: r0.wind
  metrics:
    Count(wind): Count(r0.wind)

r3 := Aggregate[r0]
  groups:
    temp_min: r0.temp_min
  metrics:
    Count(temp_min): Count(r0.temp_min)

r4 := Aggregate[r0]
  groups:
    temp_max: r0.temp_max
  metrics:
    Count(temp_max): Count(r0.temp_max)

r5 := Aggregate[r0]
  groups:
    date: r0.date
  metrics:
    Count(date): Count(r0.date)

r6 := Aggregate[r0]
  groups:
    precipitation: r0.precipitation
  metrics:
    Count(precipitation): Count(r0.precipitation)

r7 := Project[r0]
  isna: SearchedCase(cases=[IsNull(r0.weather)], results=[1], default=0)

r8 := Project[r0]
  isna: SearchedCase(cases=[IsNull(r0.wind)], results=[1], default=0)

r9 := Project[r0]
  isna: SearchedCase(cases=[IsNull(r0.temp_min)], results=[1], default=0)

r10 := Project[r0]
  isna: SearchedCase(cases=[IsNull(r0.temp_max)], results=[1], default=0)

r11 := Project[r0]
  isna: SearchedCase(cases=[IsNull(r0.date)], results=[1], default=0)

r12 := Project[r0]
  isna: SearchedCase(cases=[IsNull(r0.precipitation)], results=[1], default=0)

r13 := Sort[r1]
  desc r1['Count(weather)']

r14 := Sort[r2]
  desc r2['Count(wind)']

r15 := Sort[r3]
  desc r3['Count(temp_min)']

r16 := Sort[r4]
  desc r4['Count(temp_max)']

r17 := Sort[r5]
  desc r5['Count(date)']

r18 := Sort[r6]
  desc r6['Count(precipitation)']

r19 := Limit[r13, n=2]

r20 := Limit[r14, n=2]

r21 := Limit[r15, n=2]

r22 := Limit[r16, n=2]

r23 := Limit[r17, n=2]

r24 := Limit[r18, n=2]

r25 := Aggregate[r7]
  metrics:
    pos:       5
    name:      'weather'
    type:      'string'
    nullable:  True
    nulls:     Sum(r7.isna)
    non_nulls: Sum(1 - r7.isna)
    null_frac: Mean(r7.isna)

r26 := Aggregate[r8]
  metrics:
    pos:       4
    name:      'wind'
    type:      'float64'
    nullable:  True
    nulls:     Sum(r8.isna)
    non_nulls: Sum(1 - r8.isna)
    null_frac: Mean(r8.isna)

r27 := Aggregate[r9]
  metrics:
    pos:       3
    name:      'temp_min'
    type:      'float64'
    nullable:  True
    nulls:     Sum(r9.isna)
    non_nulls: Sum(1 - r9.isna)
    null_frac: Mean(r9.isna)

r28 := Aggregate[r10]
  metrics:
    pos:       2
    name:      'temp_max'
    type:      'float64'
    nullable:  True
    nulls:     Sum(r10.isna)
    non_nulls: Sum(1 - r10.isna)
    null_frac: Mean(r10.isna)

r29 := Aggregate[r11]
  metrics:
    pos:       0
    name:      'date'
    type:      'timestamp(6)'
    nullable:  True
    nulls:     Sum(r11.isna)
    non_nulls: Sum(1 - r11.isna)
    null_frac: Mean(r11.isna)

r30 := Aggregate[r12]
  metrics:
    pos:       1
    name:      'precipitation'
    type:      'float64'
    nullable:  True
    nulls:     Sum(r12.isna)
    non_nulls: Sum(1 - r12.isna)
    null_frac: Mean(r12.isna)

r31 := Aggregate[r19]
  metrics:
    values: ArrayCollect(r19.weather)
    counts: ArrayCollect(r19['Count(weather)'])
    pos:    5

r32 := Aggregate[r20]
  metrics:
    values: ArrayCollect(Cast(r20.wind, to=string))
    counts: ArrayCollect(r20['Count(wind)'])
    pos:    4

r33 := Aggregate[r21]
  metrics:
    values: ArrayCollect(Cast(r21.temp_min, to=string))
    counts: ArrayCollect(r21['Count(temp_min)'])
    pos:    3

r34 := Aggregate[r22]
  metrics:
    values: ArrayCollect(Cast(r22.temp_max, to=string))
    counts: ArrayCollect(r22['Count(temp_max)'])
    pos:    2

r35 := Aggregate[r23]
  metrics:
    values: ArrayCollect(Cast(r23.date, to=string))
    counts: ArrayCollect(r23['Count(date)'])
    pos:    0

r36 := Aggregate[r24]
  metrics:
    values: ArrayCollect(Cast(r24.precipitation, to=string))
    counts: ArrayCollect(r24['Count(precipitation)'])
    pos:    1

r37 := JoinChain[r25]
  JoinLink[inner, r31]
    r25.pos == r31.pos
  values:
    pos:       r25.pos
    name:      r25.name
    type:      r25.type
    nullable:  r25.nullable
    nulls:     r25.nulls
    non_nulls: r25.non_nulls
    null_frac: r25.null_frac
    values:    r31.values
    counts:    r31.counts

r38 := JoinChain[r26]
  JoinLink[inner, r32]
    r26.pos == r32.pos
  values:
    pos:       r26.pos
    name:      r26.name
    type:      r26.type
    nullable:  r26.nullable
    nulls:     r26.nulls
    non_nulls: r26.non_nulls
    null_frac: r26.null_frac
    values:    r32.values
    counts:    r32.counts

r39 := JoinChain[r27]
  JoinLink[inner, r33]
    r27.pos == r33.pos
  values:
    pos:       r27.pos
    name:      r27.name
    type:      r27.type
    nullable:  r27.nullable
    nulls:     r27.nulls
    non_nulls: r27.non_nulls
    null_frac: r27.null_frac
    values:    r33.values
    counts:    r33.counts

r40 := JoinChain[r28]
  JoinLink[inner, r34]
    r28.pos == r34.pos
  values:
    pos:       r28.pos
    name:      r28.name
    type:      r28.type
    nullable:  r28.nullable
    nulls:     r28.nulls
    non_nulls: r28.non_nulls
    null_frac: r28.null_frac
    values:    r34.values
    counts:    r34.counts

r41 := JoinChain[r29]
  JoinLink[inner, r35]
    r29.pos == r35.pos
  values:
    pos:       r29.pos
    name:      r29.name
    type:      r29.type
    nullable:  r29.nullable
    nulls:     r29.nulls
    non_nulls: r29.non_nulls
    null_frac: r29.null_frac
    values:    r35.values
    counts:    r35.counts

r42 := JoinChain[r30]
  JoinLink[inner, r36]
    r30.pos == r36.pos
  values:
    pos:       r30.pos
    name:      r30.name
    type:      r30.type
    nullable:  r30.nullable
    nulls:     r30.nulls
    non_nulls: r30.non_nulls
    null_frac: r30.null_frac
    values:    r36.values
    counts:    r36.counts

r43 := Project[r37]
  name:      r37.name
  type:      r37.type
  nullable:  r37.nullable
  nulls:     r37.nulls
  non_nulls: r37.non_nulls
  null_frac: r37.null_frac
  values:    r37.values
  counts:    r37.counts

r44 := Project[r38]
  name:      r38.name
  type:      r38.type
  nullable:  r38.nullable
  nulls:     r38.nulls
  non_nulls: r38.non_nulls
  null_frac: r38.null_frac
  values:    r38.values
  counts:    r38.counts

r45 := Project[r39]
  name:      r39.name
  type:      r39.type
  nullable:  r39.nullable
  nulls:     r39.nulls
  non_nulls: r39.non_nulls
  null_frac: r39.null_frac
  values:    r39.values
  counts:    r39.counts

r46 := Project[r40]
  name:      r40.name
  type:      r40.type
  nullable:  r40.nullable
  nulls:     r40.nulls
  non_nulls: r40.non_nulls
  null_frac: r40.null_frac
  values:    r40.values
  counts:    r40.counts

r47 := Project[r41]
  name:      r41.name
  type:      r41.type
  nullable:  r41.nullable
  nulls:     r41.nulls
  non_nulls: r41.non_nulls
  null_frac: r41.null_frac
  values:    r41.values
  counts:    r41.counts

r48 := Project[r42]
  name:      r42.name
  type:      r42.type
  nullable:  r42.nullable
  nulls:     r42.nulls
  non_nulls: r42.non_nulls
  null_frac: r42.null_frac
  values:    r42.values
  counts:    r42.counts

r49 := Union[r47, r48, distinct=False]

r50 := Union[r49, r46, distinct=False]

r51 := Union[r50, r45, distinct=False]

r52 := Union[r51, r44, distinct=False]

r53 := Union[r52, r43, distinct=False]

Project[r53]
  name:      r53.name
  type:      r53.type
  nullable:  r53.nullable
  nulls:     r53.nulls
  non_nulls: r53.non_nulls
  null_frac: r53.null_frac
  values:    r53.values
  counts:    r53.counts

Charting relies on the awesome lets-plot library from JetBrains

from vinyl.chart import geom

weather.chart(geom.scatter, x=weather.date, y=weather.temp_max, interactive = False)
2012 2013 2014 2015 2016 0 5 10 15 20 25 30 35 temp_max date

Charts made with Vinyl are interactive by default, but we turn that feature off so our docs site can render them properly.