aggregate

Returns an aggregated table for cols, grouped by by and sort.

If fill is specified, the table will be interpolated using the specified fill strategy, taking into account direction from the sort argument. fill can either be a single value or a list of values, one for each column in cols.


aggregate_all

Aggregate_all is a generalized form of aggregate that can apply apply the same operation (specified in f) to multiple columns. The col_selector field can be a list of column fields, where each element select, and f should be a list of functions of the same length.

If f is a single function, it will be applied to all columns. If f is a list of functions, the functions will be applied to the corresponding columns. If f is shorter than the number of columns, the last function will be applied to all remaining columns.

By, sort, and fill operate as in aggregate.

If rename is True, the columns will be renamed to the name of the function that was applied to them. If rename is False, the columns will names to the original column name.


chart

Visualize the table using a chart. The geoms argument should be a geom or a list of geoms. The x, y, color, fill, size, alpha, and facet arguments should be column expressions. If a list of columns is passed, the chart will be faceted by the columns in the list.

If coord_flip is True, the x and y axes will be flipped.


copy

None


count

Return the count of rows in the table.


define

Mutate is identical to select, except all current columns are included, and the new columns are added to the table. If a new column has the same name as an existing column, the existing column will be replaced.


define_all

Mutate_all is identical to select_all, except all current columns are included, and the new columns are added to the table. If a new column has the same name as an existing column, the existing column will be replaced.


distinct

Return distinct rows from the table.

If on is specified, the distinct rows will be based on the columns in on. If it is not, the distinct rows will be based on all columns.

If keep is specified, the first or last row will be kept.


drop

Remove columns from the table.


dropna

Remove rows from the table with missing values.

If on is specified, the missing values will be checked for the columns in on. If it is not, the missing values will be checked for all columns.

If how is “any”, the row will be removed if any of the values are missing. If it is “all”, the row will be removed if all of the values are missing.


eda

Return summary statistics for each column in the table.

If cols is specified, the summary statistics will be returned for the columns in cols. If it is not, the summary statistics will be returned for all columns.


execute

Run the query and return the result in the specified format. If twin is True, the twin connection will be used.


filter

Filter the table based on the conditions specified. This function should be used in place of WHERE, HAVING, and QUALIFY clauses in SQL.


filter_all

Similar to other “_all” variants, this is a generalized form of filter that can apply the same operation (specified in condition_f) to multiple columns.

The col_selector field can be a list of column fields, where each element select, and condition_f should be a list of functions of the same length.

Useful if you want to apply the same filter (e.g. value > 0) to multiple columns.

Conditions are evaluated together using the condition_type argument. If condition_type is “and”, all conditions must be met. If condition_type is “or”, any condition can be met. If you’d like to use a mix of “and” and “or” conditions, call the filter function multiple times.


from_file

Create a VinylTable from a csv, json, or parquet file


from_memory

Create a VinylTable from a pandas, polars, or pyarrow object


get_name

Return the name of the table.


limit

Return the first n rows of the table, starting at the offset row.

Note that the result set may not be idempotent.


pickle

None


relocate

Relocate columns before or after other specified columns.


rename

Rename columns in the table. The rename_dict should be a dictionary with the new column name as the key and the original column name as the value.


sample

Sample a fraction of rows from a table. Results may not be idempotent.

See specific note from Ibis below:

Sampling is by definition a random operation. Some backends support specifying a seed for repeatable results, but not all backends support that option. And some backends (duckdb, for example) do support specifying a seed but may still not have repeatable results in all cases. In all cases, results are backend-specific. An execution against one backend is unlikely to sample the same rows when executed against a different backend, even with the same seed set.


save

Run the query and save the result to the specified path in the specified format.


schema

Return the schema of the table.


select

Computes a new table with the columns in cols. Can be a single column, a list of columns, or a dictionary of columns with their new names as keys. The column values themselves can be specified as strings (the column name), table attributes, one-argument lambda functions, or selectors.

If an aggregated column is passed, this will be treated as a windowed column, using the by field for partitioning, the sort field for ordering, and the window_type and window_bounds fields for the actual window.

Fill can be used optionally to add interpolation to cols. You must either specify one value for each column or a list of values that is the same length as the column list


select_all

Select_all is a generalized form of select that can apply apply the same operation (specified in f) to multiple columns. The col_selector field can be a list of column fields, where each element select, and f should be a list of functions of the same length.

If f is a single function, it will be applied to all columns. If f is a list of functions, the functions will be applied to the corresponding columns. If f is shorter than the number of columns, the last function will be applied to all remaining columns.

By, sort, window_type, and window_bounds operate as in select.

If rename is True, the columns will be renamed to the name of the function that was applied to them. If rename is False, the columns will names to the original column name.


sort

Sort the table by the columns in by.

If by is not specified, the table will be sorted by all columns.

To sort a column in descending order, place a - in front of the column.


to_sql

Output the table as a SQL string. The dialect argument can be used to specify the SQL dialect to use.

If optimized is True, the SQL will be optimized using the SQLglot optimizer. If formatted is True, the SQL will be formatted for readability.


visualize

Print a visualize representation of the query plan