Encodings

The key to creating meaningful visualizations is to map properties of the data to visual properties in order to effectively communicate information. In Altair, this mapping of visual properties to data columns is referred to as an encoding, and is most often expressed through the Chart.encode() method.

For example, here we will visualize the cars dataset using four of the available encodings: x (the x-axis value), y (the y-axis value), color (the color of the marker), and shape (the shape of the point marker):

import altair as alt
from vega_datasets import data
cars = data.cars()

alt.Chart(cars).mark_point().encode(
    x='Horsepower',
    y='Miles_per_Gallon',
    color='Origin',
    shape='Origin'
)

For data specified as a DataFrame, Altair can automatically determine the correct data type for each encoding, and creates appropriate scales and legends to represent the data.

Encoding Channels

Altair provides a number of encoding channels that can be useful in different circumstances; the following tables summarize them:

Position Channels:

Channel	Altair Class	Description	Example
x	`X`	The x-axis value	Simple Scatter Plot
y	`Y`	The y-axis value	Simple Scatter Plot
x2	`X2`	Second x value for ranges	Error Bars showing Confidence Interval
y2	`Y2`	Second y value for ranges	Line chart with Confidence Interval Band
longitude	`Longitude`	Longitude for geo charts	Locations of US Airports
latitude	`Latitude`	Latitude for geo charts	Locations of US Airports
longitude2	`Longitude2`	Second longitude value for ranges	N/A
latitude2	`Latitude2`	Second latitude value for ranges	N/A

Mark Property Channels:

Channel	Altair Class	Description	Example
color	`Color`	The color of the mark	Simple Heatmap
fill	`Fill`	The fill for the mark	N/A
opacity	`Opacity`	The opacity of the mark	Horizon Graph
shape	`Shape`	The shape of the mark	N/A
size	`Size`	The size of the mark	Table Bubble Plot (Github Punch Card)
stroke	`Stroke`	The stroke of the mark	N/A

Text and Tooltip Channels:

Channel	Altair Class	Description	Example
text	`Text`	Text to use for the mark	Simple Scatter Plot with Labels
key	`Key`	–	N/A
tooltip	`Tooltip`	The tooltip value	Scatter Plot with Tooltips

Hyperlink Channel:

Channel	Altair Class	Description	Example
href	`Href`	Hyperlink for points	N/A

Level of Detail Channel:

Channel	Altair Class	Description	Example
detail	`Detail`	Additional property to group by	Selection Detail Example

Order Channel:

Channel	Altair Class	Description	Example
order	`Order`	Sets the order of the marks	Connected Scatterplot (Lines with Custom Paths)

Facet Channels:

Channel	Altair Class	Description	Example
column	`Column`	The column of a faceted plot	Trellis Scatter Plot
row	`Row`	The row of a faceted plot	Becker’s Barley Trellis Plot

Data Types

The details of any mapping depend on the type of the data. Altair recognizes four main data types:

Data Type	Shorthand Code	Description
quantitative	`Q`	a continuous real-valued quantity
ordinal	`O`	a discrete ordered quantity
nominal	`N`	a discrete unordered category
temporal	`T`	a time or date value

If types are not specified for data input as a DataFrame, Altair defaults to quantitative for any numeric data, temporal for date/time data, and nominal for string data, but be aware that these defaults are by no means always the correct choice!
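
The default rule described above can be sketched in plain Python. This is only an illustration of the rule, not Altair's implementation (which works from the DataFrame's column types); the helper name `infer_default_type` and the choice to treat booleans and unrecognized values as nominal are assumptions of this sketch.

```python
from datetime import date, datetime
from numbers import Number


def infer_default_type(values):
    """Guess an encoding type for a column, following the defaults described above.

    Hypothetical helper for illustration; not part of Altair.
    """
    present = [v for v in values if v is not None]
    # bool is a subclass of int in Python, so exclude it explicitly
    # (treating booleans as categories is an assumption of this sketch).
    if present and all(
        isinstance(v, Number) and not isinstance(v, bool) for v in present
    ):
        return "quantitative"
    if present and all(isinstance(v, (date, datetime)) for v in present):
        return "temporal"
    # Strings, and anything else this sketch does not recognize, fall back to nominal.
    return "nominal"


print(infer_default_type([130.0, 165.0, 150.0]))       # quantitative
print(infer_default_type([date(1970, 1, 1)]))          # temporal
print(infer_default_type(["USA", "Europe", "Japan"]))  # nominal
print(infer_default_type([4, 6, 8]))                   # quantitative
```

The last call shows why the defaults deserve a second look: a small set of integer codes is inferred as quantitative, even though an ordinal or nominal type may describe it better.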

The types can be expressed either in long-form using the channel encoding classes such as X and Y, or in short-form using the Shorthand Syntax discussed below. For example, the following two methods of specifying the type lead to identical plots:

alt.Chart(cars).mark_point().encode(
    x='Acceleration:Q',
    y='Miles_per_Gallon:Q',
    color='Origin:N'
)

alt.Chart(cars).mark_point().encode(
    alt.X('Acceleration', type='quantitative'),
    alt.Y('Miles_per_Gallon', type='quantitative'),
    alt.Color('Origin', type='nominal')
)

The shorthand form, x="name:Q", is useful for its lack of boilerplate when doing quick data explorations. The long-form, alt.X('name', type='quantitative'), is useful when doing more fine-tuned adjustments to the encoding, such as binning, axis and scale properties, or more.

Specifying the correct type for your data is important, as it affects the way Altair represents your encoding in the resulting plot.

Effect of Data Type on Color Scales

As an example of this, here we will represent the same data three different ways, with the color encoded as a quantitative, ordinal, and nominal type, using three vertically-concatenated charts (see Vertical Concatenation):

base = alt.Chart(cars).mark_point().encode(
    x='Horsepower:Q',
    y='Miles_per_Gallon:Q',
).properties(
    width=150,
    height=150
)

alt.vconcat(
    base.encode(color='Cylinders:Q').properties(title='quantitative'),
    base.encode(color='Cylinders:O').properties(title='ordinal'),
    base.encode(color='Cylinders:N').properties(title='nominal'),
)

The type specification influences the way Altair, via Vega-Lite, decides on the color scale to represent the value, and influences whether a discrete or continuous legend is used.

Effect of Data Type on Axis Scales

Similarly, for x and y axis encodings, the type used for the data will affect the scales used and the characteristics of the mark. For example, here is the difference between a quantitative and ordinal scale for a column that contains integers specifying a year:

pop = data.population.url

base = alt.Chart(pop).mark_bar().encode(
    alt.Y('mean(people):Q', axis=alt.Axis(title='mean population'))
).properties(
    width=200,
    height=200
)

alt.hconcat(
    base.encode(x='year:Q').properties(title='year=quantitative'),
    base.encode(x='year:O').properties(title='year=ordinal')
)

In Altair, quantitative scales include zero by default unless otherwise specified, while ordinal scales are limited to the values within the data.

Even when we override the default of including zero in the axis, the precise appearance of the marks representing the data is still affected by the data type:

base.encode(
    alt.X('year:Q',
        scale=alt.Scale(zero=False)
    )
)

Because quantitative values do not have an inherent width, the bars do not fill the entire space between the values. This view also makes clear the missing year of data that was not immediately apparent when we treated the years as categories.
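
The difference between the two scale types can be sketched without Altair. The helper names below are hypothetical, the year values are made up for illustration, and the domain logic is a simplification of what Vega-Lite actually computes.

```python
def quantitative_domain(values, zero=True):
    """Continuous interval covering the data, optionally stretched to include zero."""
    lo, hi = min(values), max(values)
    if zero:
        lo, hi = min(lo, 0), max(hi, 0)
    return (lo, hi)


def ordinal_domain(values):
    """Discrete domain: only the distinct values present in the data."""
    return sorted(set(values))


def gaps(values, step):
    """Regularly spaced positions inside the data range that have no data."""
    present = set(values)
    return [v for v in range(min(values), max(values) + 1, step) if v not in present]


# Illustrative years with one decade missing (not taken from the real dataset).
years = [1850, 1860, 1870, 1880, 1900, 1910]

print(quantitative_domain(years))              # (0, 1910)
print(quantitative_domain(years, zero=False))  # (1850, 1910)
print(ordinal_domain(years))                   # [1850, 1860, 1870, 1880, 1900, 1910]
print(gaps(years, step=10))                    # [1890]
```

On the continuous interval the position for 1890 still exists, so the absence of a bar there is visible; the discrete domain has no slot for it at all.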

This kind of behavior is sometimes surprising to new users, but it emphasizes the importance of thinking carefully about your data types when visualizing data: a visual encoding that is suitable for categorical data may not be suitable for quantitative data, and vice versa.

Encoding Channel Options

Each encoding channel allows for a number of additional options to be expressed; these can control things like axis properties, scale properties, headers and titles, binning parameters, aggregation, sorting, and many more.

The particular options that are available vary by encoding type; the various options are listed below.

The X and Y encodings accept the following options:

Property	Type	Description
aggregate	`Aggregate`	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
axis	anyOf(`Axis`, `null`)	An object defining properties of axis’s gridlines, ticks and labels. If `null`, the axis for the encoding channel will be removed. Default value: If undefined, default axis properties are applied.
bin	anyOf(`boolean`, `BinParams`)	A flag for binning a `quantitative` field, or an object defining binning parameters. If `true`, default binning parameters will be applied. Default value: `false`
field	anyOf(`string`, `RepeatRef`)	Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the `repeat` operator. Note: Dots (`.`) and brackets (`[` and `]`) can be used to access nested objects (e.g., `"field": "foo.bar"` and `"field": "foo['bar']"`). If field names contain dots or brackets but are not nested, you can use `\\` to escape dots and brackets (e.g., `"a\\.b"` and `"a\\[0\\]"`). See more details about escaping in the field documentation. Note: `field` is not required if `aggregate` is `count`.
scale	anyOf(`Scale`, `null`)	An object defining properties of the channel’s scale, which is the function that transforms values in the data domain (numbers, dates, strings, etc) to visual values (pixels, colors, sizes) of the encoding channels. If `null`, the scale will be disabled and the data value will be directly encoded. Default value: If undefined, default scale properties are applied.
sort	`Sort`	Sort order for the encoded field. For continuous fields (quantitative or temporal), `sort` can be either `"ascending"` or `"descending"`. For discrete fields, `sort` can be one of the following: (1) `"ascending"` or `"descending"`, for sorting by the values’ natural order in JavaScript; (2) a sort field definition, for sorting by another field; (3) an array specifying the field values in preferred order, in which case the sort order obeys the values in the array, followed by any unspecified values in their original order (for discrete time fields, values in the sort array can be date-time definition objects; in addition, for time units `"month"` and `"day"`, the values can be the month or day names, case insensitive, or their 3-letter initials, e.g., `"Mon"`, `"Tue"`); (4) `null`, indicating no sort. Default value: `"ascending"`. Note: `null` is not supported for `row` and `column`.
stack	anyOf(`StackOffset`, `null`)	Type of stacking offset if the field should be stacked. `stack` is only applicable for `x` and `y` channels with continuous domains. For example, `stack` of `y` can be used to customize stacking for a vertical bar chart. `stack` can be one of the following values: `"zero"`: stacking with baseline offset at zero value of the scale (for creating typical stacked bar and area charts); `"normalize"`: stacking with normalized domain (for creating normalized stacked bar and area charts); `"center"`: stacking with center baseline (for streamgraphs); `null`: no stacking, which produces layered bar and area charts. Default value: `zero` for plots where all of the following conditions are true: (1) the mark is `bar` or `area`; (2) the stacked measure channel (x or y) has a linear scale; (3) at least one of the non-position channels is mapped to an unaggregated field that is different from x and y. Otherwise, `null` by default.
timeUnit	`TimeUnit`	Time unit (e.g., `year`, `yearmonth`, `month`, `hours`) for a temporal field, or for a temporal field that gets cast as ordinal. Default value: `undefined` (None)
title	[string, null]	A title for the field. If `null`, the title will be removed. Default value: derived from the field’s name and transformation function (`aggregate`, `bin` and `timeUnit`). If the field has an aggregate function, the function is displayed as part of the title (e.g., `"Sum of Profit"`). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., `"Profit (binned)"`, `"Transaction Date (year-month)"`). Otherwise, the title is simply the field name. Notes: You can customize the default field title format by providing the `fieldTitle` property in the config or the `fieldTitle` function via the `compile` function’s options. If both the field definition’s `title` and the axis, header, or legend `title` are defined, the axis/header/legend title will be used.
type	`Type`	The encoded field’s type of measurement (`"quantitative"`, `"temporal"`, `"ordinal"`, or `"nominal"`). It can also be a `"geojson"` type for encoding ‘geoshape’.

The Color, Fill, Opacity, Shape, Size, and Stroke encodings accept the following options:

Property	Type	Description
aggregate	`Aggregate`	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
bin	anyOf(`boolean`, `BinParams`)	A flag for binning a `quantitative` field, or an object defining binning parameters. If `true`, default binning parameters will be applied. Default value: `false`
condition	anyOf(`ConditionalValueDef`, array(`ConditionalValueDef`))	One or more value definition(s) with a selection predicate. Note: A field definition’s `condition` property can only contain value definitions since Vega-Lite only allows at most one encoded field per encoding channel.
field	anyOf(`string`, `RepeatRef`)	Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the `repeat` operator. Note: Dots (`.`) and brackets (`[` and `]`) can be used to access nested objects (e.g., `"field": "foo.bar"` and `"field": "foo['bar']"`). If field names contain dots or brackets but are not nested, you can use `\\` to escape dots and brackets (e.g., `"a\\.b"` and `"a\\[0\\]"`). See more details about escaping in the field documentation. Note: `field` is not required if `aggregate` is `count`.
legend	anyOf(`Legend`, `null`)	An object defining properties of the legend. If `null`, the legend for the encoding channel will be removed. Default value: If undefined, default legend properties are applied.
scale	anyOf(`Scale`, `null`)	An object defining properties of the channel’s scale, which is the function that transforms values in the data domain (numbers, dates, strings, etc) to visual values (pixels, colors, sizes) of the encoding channels. If `null`, the scale will be disabled and the data value will be directly encoded. Default value: If undefined, default scale properties are applied.
sort	`Sort`	Sort order for the encoded field. For continuous fields (quantitative or temporal), `sort` can be either `"ascending"` or `"descending"`. For discrete fields, `sort` can be one of the following: (1) `"ascending"` or `"descending"`, for sorting by the values’ natural order in JavaScript; (2) a sort field definition, for sorting by another field; (3) an array specifying the field values in preferred order, in which case the sort order obeys the values in the array, followed by any unspecified values in their original order (for discrete time fields, values in the sort array can be date-time definition objects; in addition, for time units `"month"` and `"day"`, the values can be the month or day names, case insensitive, or their 3-letter initials, e.g., `"Mon"`, `"Tue"`); (4) `null`, indicating no sort. Default value: `"ascending"`. Note: `null` is not supported for `row` and `column`.
timeUnit	`TimeUnit`	Time unit (e.g., `year`, `yearmonth`, `month`, `hours`) for a temporal field, or for a temporal field that gets cast as ordinal. Default value: `undefined` (None)
title	[string, null]	A title for the field. If `null`, the title will be removed. Default value: derived from the field’s name and transformation function (`aggregate`, `bin` and `timeUnit`). If the field has an aggregate function, the function is displayed as part of the title (e.g., `"Sum of Profit"`). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., `"Profit (binned)"`, `"Transaction Date (year-month)"`). Otherwise, the title is simply the field name. Notes: You can customize the default field title format by providing the `fieldTitle` property in the config or the `fieldTitle` function via the `compile` function’s options. If both the field definition’s `title` and the axis, header, or legend `title` are defined, the axis/header/legend title will be used.
type	`Type`	The encoded field’s type of measurement (`"quantitative"`, `"temporal"`, `"ordinal"`, or `"nominal"`). It can also be a `"geojson"` type for encoding ‘geoshape’.

The Row and Column encodings accept the following options:

Property	Type	Description
aggregate	`Aggregate`	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
bin	anyOf(`boolean`, `BinParams`)	A flag for binning a `quantitative` field, or an object defining binning parameters. If `true`, default binning parameters will be applied. Default value: `false`
field	anyOf(`string`, `RepeatRef`)	Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the `repeat` operator. Note: Dots (`.`) and brackets (`[` and `]`) can be used to access nested objects (e.g., `"field": "foo.bar"` and `"field": "foo['bar']"`). If field names contain dots or brackets but are not nested, you can use `\\` to escape dots and brackets (e.g., `"a\\.b"` and `"a\\[0\\]"`). See more details about escaping in the field documentation. Note: `field` is not required if `aggregate` is `count`.
header	`Header`	An object defining properties of a facet’s header.
sort	`Sort`	Sort order for the encoded field. For continuous fields (quantitative or temporal), `sort` can be either `"ascending"` or `"descending"`. For discrete fields, `sort` can be one of the following: (1) `"ascending"` or `"descending"`, for sorting by the values’ natural order in JavaScript; (2) a sort field definition, for sorting by another field; (3) an array specifying the field values in preferred order, in which case the sort order obeys the values in the array, followed by any unspecified values in their original order (for discrete time fields, values in the sort array can be date-time definition objects; in addition, for time units `"month"` and `"day"`, the values can be the month or day names, case insensitive, or their 3-letter initials, e.g., `"Mon"`, `"Tue"`); (4) `null`, indicating no sort. Default value: `"ascending"`. Note: `null` is not supported for `row` and `column`.
timeUnit	`TimeUnit`	Time unit (e.g., `year`, `yearmonth`, `month`, `hours`) for a temporal field, or for a temporal field that gets cast as ordinal. Default value: `undefined` (None)
title	[string, null]	A title for the field. If `null`, the title will be removed. Default value: derived from the field’s name and transformation function (`aggregate`, `bin` and `timeUnit`). If the field has an aggregate function, the function is displayed as part of the title (e.g., `"Sum of Profit"`). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., `"Profit (binned)"`, `"Transaction Date (year-month)"`). Otherwise, the title is simply the field name. Notes: You can customize the default field title format by providing the `fieldTitle` property in the config or the `fieldTitle` function via the `compile` function’s options. If both the field definition’s `title` and the axis, header, or legend `title` are defined, the axis/header/legend title will be used.
type	`Type`	The encoded field’s type of measurement (`"quantitative"`, `"temporal"`, `"ordinal"`, or `"nominal"`). It can also be a `"geojson"` type for encoding ‘geoshape’.

The Text and Tooltip encodings accept the following options:

Property	Type	Description
aggregate	`Aggregate`	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
bin	anyOf(`boolean`, `BinParams`)	A flag for binning a `quantitative` field, or an object defining binning parameters. If `true`, default binning parameters will be applied. Default value: `false`
condition	anyOf(`ConditionalValueDef`, array(`ConditionalValueDef`))	One or more value definition(s) with a selection predicate. Note: A field definition’s `condition` property can only contain value definitions since Vega-Lite only allows at most one encoded field per encoding channel.
field	anyOf(`string`, `RepeatRef`)	Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the `repeat` operator. Note: Dots (`.`) and brackets (`[` and `]`) can be used to access nested objects (e.g., `"field": "foo.bar"` and `"field": "foo['bar']"`). If field names contain dots or brackets but are not nested, you can use `\\` to escape dots and brackets (e.g., `"a\\.b"` and `"a\\[0\\]"`). See more details about escaping in the field documentation. Note: `field` is not required if `aggregate` is `count`.
format	`string`	The formatting pattern for a text field. If not defined, this will be determined automatically.
timeUnit	`TimeUnit`	Time unit (e.g., `year`, `yearmonth`, `month`, `hours`) for a temporal field, or for a temporal field that gets cast as ordinal. Default value: `undefined` (None)
title	[string, null]	A title for the field. If `null`, the title will be removed. Default value: derived from the field’s name and transformation function (`aggregate`, `bin` and `timeUnit`). If the field has an aggregate function, the function is displayed as part of the title (e.g., `"Sum of Profit"`). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., `"Profit (binned)"`, `"Transaction Date (year-month)"`). Otherwise, the title is simply the field name. Notes: You can customize the default field title format by providing the `fieldTitle` property in the config or the `fieldTitle` function via the `compile` function’s options. If both the field definition’s `title` and the axis, header, or legend `title` are defined, the axis/header/legend title will be used.
type	`Type`	The encoded field’s type of measurement (`"quantitative"`, `"temporal"`, `"ordinal"`, or `"nominal"`). It can also be a `"geojson"` type for encoding ‘geoshape’.

The Detail, Key, Latitude, Latitude2, Longitude, Longitude2, X2 and Y2 encodings accept the following options:

Property	Type	Description
aggregate	`Aggregate`	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
bin	anyOf(`boolean`, `BinParams`)	A flag for binning a `quantitative` field, or an object defining binning parameters. If `true`, default binning parameters will be applied. Default value: `false`
field	anyOf(`string`, `RepeatRef`)	Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the `repeat` operator. Note: Dots (`.`) and brackets (`[` and `]`) can be used to access nested objects (e.g., `"field": "foo.bar"` and `"field": "foo['bar']"`). If field names contain dots or brackets but are not nested, you can use `\\` to escape dots and brackets (e.g., `"a\\.b"` and `"a\\[0\\]"`). See more details about escaping in the field documentation. Note: `field` is not required if `aggregate` is `count`.
timeUnit	`TimeUnit`	Time unit (e.g., `year`, `yearmonth`, `month`, `hours`) for a temporal field, or for a temporal field that gets cast as ordinal. Default value: `undefined` (None)
title	[string, null]	A title for the field. If `null`, the title will be removed. Default value: derived from the field’s name and transformation function (`aggregate`, `bin` and `timeUnit`). If the field has an aggregate function, the function is displayed as part of the title (e.g., `"Sum of Profit"`). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., `"Profit (binned)"`, `"Transaction Date (year-month)"`). Otherwise, the title is simply the field name. Notes: You can customize the default field title format by providing the `fieldTitle` property in the config or the `fieldTitle` function via the `compile` function’s options. If both the field definition’s `title` and the axis, header, or legend `title` are defined, the axis/header/legend title will be used.
type	`Type`	The encoded field’s type of measurement (`"quantitative"`, `"temporal"`, `"ordinal"`, or `"nominal"`). It can also be a `"geojson"` type for encoding ‘geoshape’.

The Href encoding accepts the following options:

Property	Type	Description
aggregate	`Aggregate`	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
bin	anyOf(`boolean`, `BinParams`)	A flag for binning a `quantitative` field, or an object defining binning parameters. If `true`, default binning parameters will be applied. Default value: `false`
condition	anyOf(`ConditionalValueDef`, array(`ConditionalValueDef`))	One or more value definition(s) with a selection predicate. Note: A field definition’s `condition` property can only contain value definitions since Vega-Lite only allows at most one encoded field per encoding channel.
field	anyOf(`string`, `RepeatRef`)	Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the `repeat` operator. Note: Dots (`.`) and brackets (`[` and `]`) can be used to access nested objects (e.g., `"field": "foo.bar"` and `"field": "foo['bar']"`). If field names contain dots or brackets but are not nested, you can use `\\` to escape dots and brackets (e.g., `"a\\.b"` and `"a\\[0\\]"`). See more details about escaping in the field documentation. Note: `field` is not required if `aggregate` is `count`.
timeUnit	`TimeUnit`	Time unit (e.g., `year`, `yearmonth`, `month`, `hours`) for a temporal field, or for a temporal field that gets cast as ordinal. Default value: `undefined` (None)
title	[string, null]	A title for the field. If `null`, the title will be removed. Default value: derived from the field’s name and transformation function (`aggregate`, `bin` and `timeUnit`). If the field has an aggregate function, the function is displayed as part of the title (e.g., `"Sum of Profit"`). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., `"Profit (binned)"`, `"Transaction Date (year-month)"`). Otherwise, the title is simply the field name. Notes: You can customize the default field title format by providing the `fieldTitle` property in the config or the `fieldTitle` function via the `compile` function’s options. If both the field definition’s `title` and the axis, header, or legend `title` are defined, the axis/header/legend title will be used.
type	`Type`	The encoded field’s type of measurement (`"quantitative"`, `"temporal"`, `"ordinal"`, or `"nominal"`). It can also be a `"geojson"` type for encoding ‘geoshape’.

The Order encoding accepts the following options:

Property	Type	Description
aggregate	`Aggregate`	Aggregation function for the field (e.g., `mean`, `sum`, `median`, `min`, `max`, `count`). Default value: `undefined` (None)
bin	anyOf(`boolean`, `BinParams`)	A flag for binning a `quantitative` field, or an object defining binning parameters. If `true`, default binning parameters will be applied. Default value: `false`
field	anyOf(`string`, `RepeatRef`)	Required. A string defining the name of the field from which to pull a data value or an object defining iterated values from the `repeat` operator. Note: Dots (`.`) and brackets (`[` and `]`) can be used to access nested objects (e.g., `"field": "foo.bar"` and `"field": "foo['bar']"`). If field names contain dots or brackets but are not nested, you can use `\\` to escape dots and brackets (e.g., `"a\\.b"` and `"a\\[0\\]"`). See more details about escaping in the field documentation. Note: `field` is not required if `aggregate` is `count`.
sort	`SortOrder`	The sort order. One of `"ascending"` (default) or `"descending"`.
timeUnit	`TimeUnit`	Time unit (e.g., `year`, `yearmonth`, `month`, `hours`) for a temporal field, or for a temporal field that gets cast as ordinal. Default value: `undefined` (None)
title	[string, null]	A title for the field. If `null`, the title will be removed. Default value: derived from the field’s name and transformation function (`aggregate`, `bin` and `timeUnit`). If the field has an aggregate function, the function is displayed as part of the title (e.g., `"Sum of Profit"`). If the field is binned or has a time unit applied, the applied function is shown in parentheses (e.g., `"Profit (binned)"`, `"Transaction Date (year-month)"`). Otherwise, the title is simply the field name. Notes: You can customize the default field title format by providing the `fieldTitle` property in the config or the `fieldTitle` function via the `compile` function’s options. If both the field definition’s `title` and the axis, header, or legend `title` are defined, the axis/header/legend title will be used.
type	`Type`	The encoded field’s type of measurement (`"quantitative"`, `"temporal"`, `"ordinal"`, or `"nominal"`). It can also be a `"geojson"` type for encoding ‘geoshape’.

Binning and Aggregation

Beyond simple channel encodings, Altair’s visualizations are built on the concept of database-style grouping and aggregation; that is, the split-apply-combine abstraction that underpins many data analysis approaches.

For example, building a histogram from a one-dimensional dataset involves splitting data based on the bin it falls in, aggregating the results within each bin using a count of the data, and then combining the results into a final figure.

In Altair, such an operation looks like this:

alt.Chart(cars).mark_bar().encode(
    alt.X('Horsepower', bin=True),
    y='count()'
    # could also use alt.Y(aggregate='count', type='quantitative')
)

Notice here we use the shorthand version of expressing an encoding channel (see Encoding Shorthands) with the count aggregation, which is the one aggregation that does not require a field to be specified.
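
To make the split-apply-combine steps behind the histogram concrete, here is a minimal plain-Python sketch. The function name `histogram` and the fixed `bin_width` argument are assumptions of this sketch (with `bin=True` the bin boundaries are chosen automatically), and the horsepower values are made up.

```python
import math
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall in each bin of the given width."""
    # Split: label every value with the lower edge of its bin.
    bin_starts = (math.floor(v / bin_width) * bin_width for v in values)
    # Apply: count the values that share a label.
    counts = Counter(bin_starts)
    # Combine: return (bin_start, count) pairs in axis order.
    return sorted(counts.items())


horsepower = [52, 88, 95, 110, 130, 165]  # illustrative values
print(histogram(horsepower, bin_width=50))  # [(50, 3), (100, 2), (150, 1)]
```

Each resulting pair corresponds to one bar: the bin start positions it along x, and the count sets its height.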

Similarly, we can create a two-dimensional histogram using, for example, the size of points to indicate counts within the grid (sometimes called a “Bubble Plot”):

alt.Chart(cars).mark_point().encode(
    alt.X('Horsepower', bin=True),
    alt.Y('Miles_per_Gallon', bin=True),
    size='count()',
)

There is no need, however, to limit aggregations to counts alone. For example, we could similarly create a plot where the color of each point represents the mean of a third quantity, such as acceleration:

alt.Chart(cars).mark_circle().encode(
    alt.X('Horsepower', bin=True),
    alt.Y('Miles_per_Gallon', bin=True),
    size='count()',
    color='average(Acceleration):Q'
)
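
The grouping behind such a chart can be sketched in plain Python: rows are grouped by their (x bin, y bin) cell, and each cell yields both a count (for size) and a mean of a third field (for color). The helper names, bin widths, and data rows below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def bin_start(value, width):
    """Lower edge of the bin of the given width that contains value."""
    return (value // width) * width


def binned_summary(rows, x, y, z, x_width, y_width):
    """Group rows by (x bin, y bin) and compute count and mean of z per cell."""
    cells = defaultdict(list)
    for row in rows:
        key = (bin_start(row[x], x_width), bin_start(row[y], y_width))
        cells[key].append(row[z])
    return {
        key: {"count": len(zs), "mean": mean(zs)} for key, zs in cells.items()
    }


rows = [  # made-up records, not taken from the cars dataset
    {"hp": 95, "mpg": 24, "acc": 15.0},
    {"hp": 88, "mpg": 27, "acc": 17.0},
    {"hp": 130, "mpg": 18, "acc": 12.0},
]
summary = binned_summary(rows, "hp", "mpg", "acc", x_width=50, y_width=10)
print(summary[(50, 20)])   # {'count': 2, 'mean': 16.0}
print(summary[(100, 10)])  # {'count': 1, 'mean': 12.0}
```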

In addition to count and average, there are a large number of available aggregation functions built into Altair; they are listed in the following table:

Aggregate	Description	Example
argmin	An input data object containing the minimum field value.	N/A
argmax	An input data object containing the maximum field value.	N/A
average	The mean (average) field value. Identical to mean.	Line Chart with Layered Aggregates
count	The total count of data objects in the group.	Simple Heatmap
distinct	The count of distinct field values.	N/A
max	The maximum field value.	Box Plot with Min/Max Whiskers
mean	The mean (average) field value.	Layered Plot with Dual-Axis
median	The median field value.	Box Plot with Min/Max Whiskers
min	The minimum field value.	Box Plot with Min/Max Whiskers
missing	The count of null or undefined field values.	N/A
q1	The lower quartile boundary of values.	Box Plot with Min/Max Whiskers
q3	The upper quartile boundary of values.	Box Plot with Min/Max Whiskers
ci0	The lower boundary of the bootstrapped 95% confidence interval of the mean.	Error Bars showing Confidence Interval
ci1	The upper boundary of the bootstrapped 95% confidence interval of the mean.	Error Bars showing Confidence Interval
stderr	The standard error of the field values.	N/A
stdev	The sample standard deviation of field values.	N/A
stdevp	The population standard deviation of field values.	N/A
sum	The sum of field values.	Streamgraph
valid	The count of field values that are not null or undefined.	N/A
values	??	N/A
variance	The sample variance of field values.	N/A
variancep	The population variance of field values.	N/A
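
Several of these aggregates come in closely related pairs (count/valid/missing, stdev/stdevp, variance/variancep). The following sketch computes a few of them for a single group with Python's standard `statistics` module, to show how the pairs differ. The `summarize` helper and the sample values are hypothetical, and the numbers come from Python's definitions rather than from Vega-Lite itself.

```python
import statistics


def summarize(values):
    """Compute a handful of the aggregates from the table above for one group."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),                    # every data object in the group
        "valid": len(present),                   # values that are not missing
        "missing": len(values) - len(present),   # values that are missing
        "sum": sum(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present),      # sample: divides by n - 1
        "stdevp": statistics.pstdev(present),    # population: divides by n
        "variance": statistics.variance(present),
        "variancep": statistics.pvariance(present),
    }


group = [2, 4, 4, 4, 5, 5, 7, 9, None]  # illustrative values with one missing entry
summary = summarize(group)
print(summary["count"], summary["valid"], summary["missing"])  # 9 8 1
print(summary["mean"], summary["median"])                      # 5 4.5
```

For this group the population variance is 4 while the sample variance is 32/7 (about 4.57), which is the distinction between the `variancep` and `variance` rows.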

Encoding Shorthands

For convenience, Altair allows the specification of the variable name along with the aggregate and type within a simple shorthand string syntax. This makes use of the type shorthand codes listed in Data Types as well as the aggregate names listed in Binning and Aggregation. The following table shows examples of the shorthand specification alongside the long-form equivalent:

Shorthand	Equivalent long-form
`x='name'`	`alt.X('name')`
`x='name:Q'`	`alt.X('name', type='quantitative')`
`x='sum(name)'`	`alt.X('name', aggregate='sum')`
`x='sum(name):Q'`	`alt.X('name', aggregate='sum', type='quantitative')`
`x='count():Q'`	`alt.X(aggregate='count', type='quantitative')`
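
The structure of the shorthand string can be illustrated with a small parser. This is a simplified sketch covering only the forms in the table above, not Altair's own parsing code; the names `parse_shorthand` and `TYPE_CODES` are hypothetical.

```python
import re

# Type shorthand codes from the Data Types table.
TYPE_CODES = {"Q": "quantitative", "O": "ordinal", "N": "nominal", "T": "temporal"}

# Either "aggregate(field)" or a bare "field", optionally followed by ":CODE".
_SHORTHAND = re.compile(
    r"^(?:(?P<aggregate>\w+)\((?P<agg_field>[^)]*)\)|(?P<field>[^:()]+))"
    r"(?::(?P<code>[QONT]))?$"
)


def parse_shorthand(shorthand):
    """Split a shorthand string into field, aggregate, and type parts."""
    match = _SHORTHAND.match(shorthand)
    if match is None:
        raise ValueError(f"unrecognized shorthand: {shorthand!r}")
    parts = {}
    field = match.group("field") or match.group("agg_field")
    if field:  # 'count()' has no field
        parts["field"] = field
    if match.group("aggregate"):
        parts["aggregate"] = match.group("aggregate")
    if match.group("code"):
        parts["type"] = TYPE_CODES[match.group("code")]
    return parts


print(parse_shorthand("name"))
# {'field': 'name'}
print(parse_shorthand("sum(name):Q"))
# {'field': 'name', 'aggregate': 'sum', 'type': 'quantitative'}
print(parse_shorthand("count():Q"))
# {'aggregate': 'count', 'type': 'quantitative'}
```

The resulting keys line up with the keyword arguments in the long-form column of the table.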

Ordering marks

The order option and Order channel control the order in which marks are drawn on the chart.

For stacked marks, this controls the order of components of the stack. Here, the elements of each bar are sorted alphabetically by the name of the nominal data in the color channel.

import altair as alt
from vega_datasets import data

barley = data.barley()

alt.Chart(barley).mark_bar().encode(
    x='variety:N',
    y='sum(yield):Q',
    color='site:N',
    order=alt.Order("site", sort="ascending")
)

The order can be reversed by changing the sort option to descending.

import altair as alt
from vega_datasets import data

barley = data.barley()

alt.Chart(barley).mark_bar().encode(
    x='variety:N',
    y='sum(yield):Q',
    color='site:N',
    order=alt.Order("site", sort="descending")
)
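
Conceptually, the order determines which component is placed first in each stack. The sketch below lays out the segments of a single bar in plain Python; the `stack_segments` helper and the yield totals are hypothetical, and it only mirrors the idea rather than Vega-Lite's actual stacking code.

```python
def stack_segments(totals, descending=False):
    """Return (name, start, end) for each stack component, in stacking order."""
    segments = []
    baseline = 0
    # Sort component names alphabetically, reversed when descending is requested.
    for name in sorted(totals, reverse=descending):
        segments.append((name, baseline, baseline + totals[name]))
        baseline += totals[name]
    return segments


# Made-up yield totals per site for one variety.
totals = {"Waseca": 3, "Crookston": 2, "Morris": 1}

print(stack_segments(totals))
# [('Crookston', 0, 2), ('Morris', 2, 3), ('Waseca', 3, 6)]
print(stack_segments(totals, descending=True))
# [('Waseca', 0, 3), ('Morris', 3, 4), ('Crookston', 4, 6)]
```

The total height of the bar is the same in both cases; only the position of each component within it changes.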

The same approach works for other mark types, like stacked area charts.

import altair as alt
from vega_datasets import data

barley = data.barley()

alt.Chart(barley).mark_area().encode(
    x='variety:N',
    y='sum(yield):Q',
    color='site:N',
    order=alt.Order("site", sort="ascending")
)

For line marks, the order channel encodes the order in which data points are connected. This can be useful for creating a scatterplot that draws lines between the dots using a different field than the x and y axes.

import altair as alt
from vega_datasets import data

driving = data.driving()

alt.Chart(driving).mark_line(point=True).encode(
    alt.X('miles', scale=alt.Scale(zero=False)),
    alt.Y('gas', scale=alt.Scale(zero=False)),
    order='year'
)
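
The effect of the order channel on a line can be sketched in plain Python: the points are the same, but the sequence in which they are connected follows the order field rather than the x value. The `connection_path` helper and the data rows are hypothetical, chosen so that miles do not increase steadily with year.

```python
def connection_path(rows, x, y, order):
    """Return the (x, y) points in the sequence given by the order field."""
    ordered = sorted(rows, key=lambda row: row[order])
    return [(row[x], row[y]) for row in ordered]


rows = [  # made-up records, not taken from the driving dataset
    {"year": 1958, "miles": 4000, "gas": 2.2},
    {"year": 1956, "miles": 3800, "gas": 2.4},
    {"year": 1957, "miles": 3600, "gas": 2.3},
]

print(connection_path(rows, "miles", "gas", order="year"))
# [(3800, 2.4), (3600, 2.3), (4000, 2.2)]
print(connection_path(rows, "miles", "gas", order="miles"))
# [(3600, 2.3), (3800, 2.4), (4000, 2.2)]
```

Ordering by year lets the path double back along the x-axis, which is what makes a connected scatterplot possible.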