Indexing and selecting data
DataArrays and Datasets with an xvec.GeometryIndex support standard indexing, slicing and selection from Xarray on non-geometric dimensions plus specific spatial indexing options based on geometric dimensions. To make the example more interesting, create a Dataset of trips between individual taxi zones in New York City in January 2022.
import datetime
import geopandas as gpd
import pandas as pd
import xarray as xr
from shapely import Point, box
import xvec
You can index the data by payment type, day of the month, hour of the day, origin zone and destination zone. For each combination, the Dataset stores the trip count and the mean trip distance, fare amount and tip amount. Sparse arrays would suit data with this many empty combinations, but they do not support all indexing methods, so this example uses dense arrays.
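As a minimal sketch of the approach described above, the following toy example turns a grouped, multi-indexed DataFrame into a dense Dataset. The `records` frame, its `zone` column and the values in it are hypothetical and stand in for the real trip records. Combinations that never occur in the data are filled with NaN, which is why dense output can consist mostly of `nan`.

```python
import pandas as pd
import xarray as xr

# Hypothetical toy records: three trips over two hours and two zones.
records = pd.DataFrame(
    {
        "hour": [8, 8, 17],
        "zone": ["A", "B", "A"],
        "fare_amount": [10.0, 14.0, 22.0],
    }
)

# Aggregating over two columns yields a DataFrame with a two-level MultiIndex.
grouped = records.groupby(["hour", "zone"]).agg({"fare_amount": "mean"})

# Each index level becomes a dimension of the Dataset. The combination
# hour=17, zone="B" never occurs, so that cell is filled with NaN.
dense = xr.Dataset.from_dataframe(grouped)

observed = dense.fare_amount.sel(hour=17, zone="A").item()
missing = dense.fare_amount.sel(hour=17, zone="B")
print(observed, bool(missing.isnull()))
```

`Dataset.from_dataframe` also accepts `sparse=True` when the optional `sparse` package is installed; it is not used here because, as noted above, sparse arrays do not support all indexing methods.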
# Load the data
trips = pd.read_parquet(
    "https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2022-01.parquet"
)  # 33MB
zones = gpd.read_file(
    "https://d37ci6vzurychx.cloudfront.net/misc/taxi_zones.zip"
)  # 1MB
lookup = pd.read_csv("https://d37ci6vzurychx.cloudfront.net/misc/taxi+_zone_lookup.csv")
# create variables for day and hour
trips["date"] = trips.tpep_pickup_datetime.dt.date
trips["hour"] = trips.tpep_pickup_datetime.dt.hour
# use groupby over five columns to create a multi-indexed DataFrame with aggregations
# and create a Dataset backed by dense arrays
taxi_trips = xr.Dataset.from_dataframe(
    trips[  # filter only trips with known locations
        trips.PULocationID.isin(zones.LocationID)
        & trips.DOLocationID.isin(zones.LocationID)
    ]
    .groupby(["payment_type", "date", "hour", "PULocationID", "DOLocationID"])
    .agg(
        {
            "trip_distance": "mean",
            "VendorID": "count",
            "tip_amount": "mean",
            "fare_amount": "mean",
        }
    ),
)
# Replace int codes with labels
taxi_trips["payment_type"] = [
    "Credit card",
    "Cash",
    "No charge",
    "Dispute",
    "Unknown",
    "Voided trip",
]
# create linkable geometry variable
taxi_zones = (
    lookup.merge(
        zones.dissolve("LocationID")[["zone", "geometry"]],
        left_on="Zone",
        right_on="zone",
        how="left",
    )
    .set_index("LocationID")
    .geometry
)
# replace location IDs with actual geometries
taxi_trips["PULocationID"] = taxi_zones.loc[taxi_trips.PULocationID].values
taxi_trips["DOLocationID"] = taxi_zones.loc[taxi_trips.DOLocationID].values
# rename
taxi_trips = taxi_trips.rename(
    {"PULocationID": "origin", "DOLocationID": "destination", "VendorID": "trips_count"}
)
# assign GeometryIndex
taxi_trips = taxi_trips.xvec.set_geom_indexes(["origin", "destination"], crs=zones.crs)
taxi_trips
<xarray.Dataset>
Dimensions: (payment_type: 6, date: 40, hour: 24, origin: 252,
destination: 257)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
* date (date) object 2008-12-31 2009-01-01 ... 2022-04-06 2022-05-18
* hour (hour) int32 0 1 2 3 4 5 6 7 8 ... 15 16 17 18 19 20 21 22 23
* origin (origin) object POLYGON ((933100.9183527103 192536.0856972...
* destination (destination) object POLYGON ((933100.9183527103 192536.08...
Data variables:
trip_distance (payment_type, date, hour, origin, destination) float64 na...
trips_count (payment_type, date, hour, origin, destination) float64 na...
tip_amount (payment_type, date, hour, origin, destination) float64 na...
fare_amount (payment_type, date, hour, origin, destination) float64 na...
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
The resulting Dataset has two dimensions, origin and destination, that are each backed by a GeometryIndex. The remaining dimensions use a regular PandasIndex.
taxi_trips.xindexes
Indexes:
payment_type PandasIndex
date PandasIndex
hour PandasIndex
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
Selection by geometry
Geometry as a label
You can select data based on geometry as with any other index, treating it as a label.
taxi_trips.sel(destination=[zones.geometry[0], zones.geometry[3]])
<xarray.Dataset>
Dimensions: (payment_type: 6, date: 40, hour: 24, origin: 252,
destination: 2)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
* date (date) object 2008-12-31 2009-01-01 ... 2022-04-06 2022-05-18
* hour (hour) int32 0 1 2 3 4 5 6 7 8 ... 15 16 17 18 19 20 21 22 23
* origin (origin) object POLYGON ((933100.9183527103 192536.0856972...
* destination (destination) object POLYGON ((933100.9183527103 192536.08...
Data variables:
trip_distance (payment_type, date, hour, origin, destination) float64 na...
trips_count (payment_type, date, hour, origin, destination) float64 na...
tip_amount (payment_type, date, hour, origin, destination) float64 na...
fare_amount (payment_type, date, hour, origin, destination) float64 na...
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
Nearest
Alternatively, you can select based on the nearest neighbor. Remember that all geometries, those in the index and those in the query, must use the same Coordinate Reference System.
taxi_trips.sel(
    date=datetime.date(2022, 1, 28),
    hour=12,
    origin=[Point(1064321, 211194), Point(988669, 207721)],
    destination=[Point(998142, 191215), Point(1010116, 42998)],
    method="nearest",
)
<xarray.Dataset>
Dimensions: (payment_type: 6, origin: 2, destination: 2)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
date object 2022-01-28
hour int32 12
* origin (origin) object POLYGON ((1066997.4698108435 212947.336632...
* destination (destination) object POLYGON ((1000036.9036584795 194829.4...
Data variables:
trip_distance (payment_type, origin, destination) float64 nan nan ... nan
trips_count (payment_type, origin, destination) float64 nan nan ... nan
tip_amount (payment_type, origin, destination) float64 nan nan ... nan
fare_amount (payment_type, origin, destination) float64 nan nan ... nan
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
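Spatial query
You can also select data with a spatial query by passing a single geometry as the label and a spatial predicate as the method. The following example selects all origin zones that intersect a bounding box: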
taxi_trips.sel(origin=box(998142, 191215, 1024321, 211194), method="intersects")
<xarray.Dataset>
Dimensions: (payment_type: 6, date: 40, hour: 24, origin: 25,
destination: 257)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
* date (date) object 2008-12-31 2009-01-01 ... 2022-04-06 2022-05-18
* hour (hour) int32 0 1 2 3 4 5 6 7 8 ... 15 16 17 18 19 20 21 22 23
* origin (origin) object POLYGON ((1000036.9036584795 194829.433560...
* destination (destination) object POLYGON ((933100.9183527103 192536.08...
Data variables:
trip_distance (payment_type, date, hour, origin, destination) float64 na...
trips_count (payment_type, date, hour, origin, destination) float64 na...
tip_amount (payment_type, date, hour, origin, destination) float64 na...
fare_amount (payment_type, date, hour, origin, destination) float64 na...
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
- payment_typePandasIndex
PandasIndex(Index(['Credit card', 'Cash', 'No charge', 'Dispute', 'Unknown', 'Voided trip'], dtype='object', name='payment_type')) - datePandasIndex
PandasIndex(Index([2008-12-31, 2009-01-01, 2021-12-31, 2022-01-01, 2022-01-02, 2022-01-03, 2022-01-04, 2022-01-05, 2022-01-06, 2022-01-07, 2022-01-08, 2022-01-09, 2022-01-10, 2022-01-11, 2022-01-12, 2022-01-13, 2022-01-14, 2022-01-15, 2022-01-16, 2022-01-17, 2022-01-18, 2022-01-19, 2022-01-20, 2022-01-21, 2022-01-22, 2022-01-23, 2022-01-24, 2022-01-25, 2022-01-26, 2022-01-27, 2022-01-28, 2022-01-29, 2022-01-30, 2022-01-31, 2022-02-01, 2022-02-22, 2022-03-09, 2022-03-15, 2022-04-06, 2022-05-18], dtype='object', name='date')) - hourPandasIndex
PandasIndex(Index([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23], dtype='int32', name='hour')) - originGeometryIndex (crs=EPSG:2263)
GeometryIndex( [<POLYGON ((1000036.904 194829.434, 1000276.454 194635.123, 1000351.303 19457...> <POLYGON ((996576.041 197074.987, 996998.846 196888.229, 997072.038 196852.4...> <POLYGON ((995798.638 199155.97, 996223.601 198955.826, 996593.391 198777.13...> <POLYGON ((1003166.891 204533.535, 1003184.978 204525.533, 1003189.02 204527...> ... <POLYGON ((1009214.71 212202.191, 1009143.168 211988.871, 1008732.076 212054...> <POLYGON ((1014422.557 210792.836, 1014532.616 210037.008, 1014775.896 21008...> <POLYGON ((1011466.966 216463.005, 1011545.889 216046.871, 1011571.962 21605...> <POLYGON ((999916.846 213275.139, 1000066.513 213189.698, 1000124.04 213157....>], crs=EPSG:2263) - destinationGeometryIndex (crs=EPSG:2263)
GeometryIndex( [<POLYGON ((933100.918 192536.086, 933091.011 192572.175, 933088.585 192604.9...> <MULTIPOLYGON (((1020447.262 151454.148, 1020359.08 151573.453, 1020263.117 ...> <POLYGON ((1026308.77 256767.698, 1026495.593 256638.616, 1026567.23 256589....> <POLYGON ((992073.467 203714.076, 992068.667 203711.502, 992061.716 203711.7...> ... <POLYGON ((1011466.966 216463.005, 1011545.889 216046.871, 1011571.962 21605...> <POLYGON ((980555.204 196138.486, 980570.792 195964.592, 980315.025 195970.7...> <MULTIPOLYGON (((999824.883 224487.553, 999867.093 224500.855, 999884.647 22...> <POLYGON ((997493.323 220912.386, 997355.264 220664.404, 996698.728 221027.4...>], crs=EPSG:2263)
A spatial query through the sel() method with any predicate other than "nearest" accepts only a scalar geometry as input. To query with an array of geometries, use the .xvec.query() method instead.
taxi_trips.xvec.query(
"origin", [Point(1064321, 211194), Point(1064321, 211194).buffer(500)]
)
<xarray.Dataset>
Dimensions: (payment_type: 6, date: 40, hour: 24, origin: 4,
destination: 257)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
* date (date) object 2008-12-31 2009-01-01 ... 2022-04-06 2022-05-18
* hour (hour) int32 0 1 2 3 4 5 6 7 8 ... 15 16 17 18 19 20 21 22 23
* origin (origin) object POLYGON ((1060888.899369195 212784.6402245...
* destination (destination) object POLYGON ((933100.9183527103 192536.08...
Data variables:
trip_distance (payment_type, date, hour, origin, destination) float64 na...
trips_count (payment_type, date, hour, origin, destination) float64 na...
tip_amount (payment_type, date, hour, origin, destination) float64 na...
fare_amount (payment_type, date, hour, origin, destination) float64 na...
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
.xvec.query() is a wrapper around shapely.STRtree. Without a predicate, it returns the subset of the original object where the bounding box of each input geometry intersects the bounding box of a geometry in the GeometryIndex. If a predicate is provided, the tree geometries are first queried based on the bounding box of the input geometry and then further filtered to those that meet the predicate when comparing the input geometry to the tree geometry: predicate(geometry, index_geometry).
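The two-stage filtering described above can be reproduced directly with shapely.STRtree. The following is a minimal sketch under stated assumptions, not xvec's own implementation: the triangle and the box are hypothetical stand-ins for the zone polygons held by a GeometryIndex, and all coordinates are made up for illustration.

```python
from shapely import Point, Polygon, STRtree, box

# Hypothetical stand-ins for geometries stored in a GeometryIndex.
zones = [
    Polygon([(0, 0), (10, 0), (0, 10)]),  # position 0: a triangle
    box(20, 0, 30, 10),                   # position 1: a far-away square
]
tree = STRtree(zones)

# Both points fall inside the triangle's bounding box,
# but only the first one lies inside the triangle itself.
queries = [Point(2, 2), Point(8, 8)]

# No predicate: only bounding boxes are compared.
# Row 0 holds positions in `queries`, row 1 positions in `zones`.
bbox_hits = tree.query(queries)

# With a predicate: bounding-box candidates are filtered further
# by evaluating predicate(query_geometry, tree_geometry).
within_hits = tree.query(queries, predicate="within")

print(bbox_hits.T.tolist())
print(within_hits.T.tolist())
```

The bounding-box query reports both points as matches of the triangle, while the "within" query keeps only the point that is really inside it. This is why a query without a predicate can return more geometries than expected.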
taxi_trips.xvec.query(
"origin",
[Point(1064321, 211194), Point(1064321, 211194).buffer(500)],
predicate="within",
)
<xarray.Dataset>
Dimensions: (payment_type: 6, date: 40, hour: 24, origin: 2,
destination: 257)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
* date (date) object 2008-12-31 2009-01-01 ... 2022-04-06 2022-05-18
* hour (hour) int32 0 1 2 3 4 5 6 7 8 ... 15 16 17 18 19 20 21 22 23
* origin (origin) object POLYGON ((1066997.4698108435 212947.336632...
* destination (destination) object POLYGON ((933100.9183527103 192536.08...
Data variables:
trip_distance (payment_type, date, hour, origin, destination) float64 na...
trips_count (payment_type, date, hour, origin, destination) float64 na...
tip_amount (payment_type, date, hour, origin, destination) float64 na...
fare_amount (payment_type, date, hour, origin, destination) float64 na...
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
Because several query geometries may match the same index geometry, the method returns duplicated observations by default. Pass unique=True to drop the duplicates.
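Where the duplicates come from can be seen at the level of shapely.STRtree. This is a rough sketch with made-up geometries: the two boxes are hypothetical zones, and using np.unique to mimic unique=True is an assumption made for illustration rather than a statement about xvec internals.

```python
import numpy as np
from shapely import Point, STRtree, box

# Hypothetical zones standing in for the index geometries.
tree = STRtree([box(0, 0, 10, 10), box(20, 0, 30, 10)])

# Two query geometries that both lie within the first zone.
queries = [Point(5, 5), Point(5, 5).buffer(1)]

# One (query, zone) pair is returned per match:
# row 0 holds positions in `queries`, row 1 positions in the tree.
query_idx, zone_idx = tree.query(queries, predicate="within")

# Default behaviour: the first zone is selected once per matching query.
duplicated = tree.geometries.take(zone_idx)

# Illustrative equivalent of unique=True: keep each matched zone once.
unique_zones = tree.geometries.take(np.unique(zone_idx))

print(len(duplicated), len(unique_zones))
```

Keeping the duplicates can be useful when each row of the result should line up with one query geometry; deduplicating is the better choice when only the set of matched zones matters.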
taxi_trips.xvec.query(
"origin",
[Point(1064321, 211194), Point(1064321, 211194).buffer(500)],
predicate="within",
unique=True,
)
<xarray.Dataset>
Dimensions: (payment_type: 6, date: 40, hour: 24, origin: 1,
destination: 257)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
* date (date) object 2008-12-31 2009-01-01 ... 2022-04-06 2022-05-18
* hour (hour) int32 0 1 2 3 4 5 6 7 8 ... 15 16 17 18 19 20 21 22 23
* origin (origin) object POLYGON ((1066997.4698108435 212947.336632...
* destination (destination) object POLYGON ((933100.9183527103 192536.08...
Data variables:
trip_distance (payment_type, date, hour, origin, destination) float64 na...
trips_count (payment_type, date, hour, origin, destination) float64 na...
tip_amount (payment_type, date, hour, origin, destination) float64 na...
fare_amount (payment_type, date, hour, origin, destination) float64 na...
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
When using the "dwithin" predicate (which searches for geometries within a set distance), also pass the distance argument.
taxi_trips.xvec.query(
"origin",
[Point(1064321, 211194), Point(1064321, 211194).buffer(500)],
predicate="dwithin",
unique=True,
distance=5000,
)
<xarray.Dataset>
Dimensions: (payment_type: 6, date: 40, hour: 24, origin: 3,
destination: 257)
Coordinates:
* payment_type (payment_type) <U11 'Credit card' 'Cash' ... 'Voided trip'
* date (date) object 2008-12-31 2009-01-01 ... 2022-04-06 2022-05-18
* hour (hour) int32 0 1 2 3 4 5 6 7 8 ... 15 16 17 18 19 20 21 22 23
* origin (origin) object POLYGON ((1060888.899369195 212784.6402245...
* destination (destination) object POLYGON ((933100.9183527103 192536.08...
Data variables:
trip_distance (payment_type, date, hour, origin, destination) float64 na...
trips_count (payment_type, date, hour, origin, destination) float64 na...
tip_amount (payment_type, date, hour, origin, destination) float64 na...
fare_amount (payment_type, date, hour, origin, destination) float64 na...
Indexes:
origin GeometryIndex (crs=EPSG:2263)
destination GeometryIndex (crs=EPSG:2263)
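To make the semantics of a distance-based predicate concrete, the following is a minimal pure-Python sketch of a "within distance" selection on points. It is not how xvec performs the query; the `zones` dictionary, the `dwithin` helper and all coordinates are hypothetical and exist only for illustration. Each match is reported once even if several query points hit it, which mirrors the intent of unique=True.

```python
import math

# Hypothetical zone centroids (x, y) in a projected CRS; not taken from the taxi data.
zones = {
    "zone_a": (0.0, 0.0),
    "zone_b": (3000.0, 4000.0),  # exactly 5000 units from (0, 0)
    "zone_c": (9000.0, 0.0),
}


def dwithin(candidates, query_points, distance):
    """Return names of candidates within `distance` of any query point.

    A candidate is listed once even when several query points match it.
    """
    hits = []
    for name, xy in candidates.items():
        if any(math.dist(xy, q) <= distance for q in query_points):
            hits.append(name)
    return hits


queries = [(0.0, 0.0), (100.0, 0.0)]
selected = dwithin(zones, queries, distance=5000)
print(selected)
```

The comparison is inclusive, so a candidate lying exactly at the threshold distance is still selected.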
Masking variable geometry#
When the geometry is stored as a data variable rather than as a coordinate, .xvec.mask can be used to mask the DataArray: every input geometry is queried against every geometry in the DataArray.
glaciers_df = gpd.read_file("https://github.com/loreabad6/post/raw/refs/heads/main/inst/extdata/svalbard.gpkg")
glaciers = (
glaciers_df.set_index(["name", "year"])
.to_xarray()
.proj.assign_crs(spatial_ref=glaciers_df.crs) # use xproj to store the CRS information
)
glaciers
<xarray.Dataset> Size: 432B
Dimensions: (name: 5, year: 3)
Coordinates:
* name (name) object 40B 'Austre Brøggerbreen' ... 'Steenbreen'
* year (year) float64 24B 1.936e+03 1.99e+03 2.007e+03
* spatial_ref int64 8B 0
Data variables:
length (name, year) float64 120B 5.808e+03 5.265e+03 ... 1.819e+03
fwidth (name, year) float64 120B 1.254e+03 470.1 888.4 ... 279.4 202.6
geometry (name, year) object 120B POLYGON ((432375.11039999966 876165...
Indexes:
spatial_ref CRSIndex (crs=EPSG:32633)
mask = glaciers.geometry.xvec.mask(geometry=Point((437400.506, 8755169.942)))
mask
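The resulting mask can be passed to where to keep only the matching entries of the original Dataset; all other entries become NaN. In the result below, only the values for Edithbreen are kept.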
glaciers.where(mask)
<xarray.Dataset> Size: 432B
Dimensions: (name: 5, year: 3)
Coordinates:
* name (name) object 40B 'Austre Brøggerbreen' ... 'Steenbreen'
* year (year) float64 24B 1.936e+03 1.99e+03 2.007e+03
* spatial_ref int64 8B 0
Data variables:
length (name, year) float64 120B nan nan nan nan ... nan nan nan nan
fwidth (name, year) float64 120B nan nan nan nan ... nan nan nan nan
geometry (name, year) object 120B nan nan nan nan ... nan nan nan nan
Indexes:
spatial_ref CRSIndex (crs=EPSG:32633)
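The mask-then-where pattern can be sketched without any geospatial libraries. The example below is an illustration only: it builds a hypothetical 2 x 2 grid of square polygons (rows standing in for names, columns for years), tests each one against a point with a simple ray-casting containment check, and then blanks out the non-matching values. The `contains` and `square` helpers are assumptions made for this sketch, and the containment test is a stand-in for whatever predicate .xvec.mask applies.

```python
import math


def contains(polygon, point):
    """Ray-casting point-in-polygon test for a simple polygon of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def square(x0, y0, size):
    """Axis-aligned square with its lower-left corner at (x0, y0)."""
    return [(x0, y0), (x0 + size, y0), (x0 + size, y0 + size), (x0, y0 + size)]


# Hypothetical geometry variable: one polygon per (name, year) cell.
geometry = [
    [square(0, 0, 10), square(0, 0, 8)],
    [square(20, 20, 10), square(20, 20, 8)],
]
# Hypothetical data variable with the same shape.
length = [[5.0, 4.0], [3.0, 2.0]]

point = (5, 5)

# Boolean mask with the same shape as the geometry grid.
mask = [[contains(poly, point) for poly in row] for row in geometry]

# Equivalent in spirit to where(mask): keep matching values, replace the rest with NaN.
masked_length = [
    [value if keep else math.nan for value, keep in zip(values, flags)]
    for values, flags in zip(length, mask)
]
print(mask)
print(masked_length)
```

The mask has the same shape as the geometry variable, so it lines up cell by cell with every other variable that shares those dimensions.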