xray at SciPy 2015
stephan-hoyer
![Page 1: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/1.jpg)
![Page 2: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/2.jpg)
![Page 3: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/3.jpg)
[Figure: axes labeled latitude, longitude, time]
![Page 4: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/4.jpg)
![Page 5: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/5.jpg)
![Page 6: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/6.jpg)
DataArray
Dataset
![Page 7: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/7.jpg)
[Figure: labels time, longitude, latitude, land_cover, elevation]
![Page 8: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/8.jpg)
>>> ds
<xray.Dataset>
Dimensions:      (time: 10, latitude: 8, longitude: 8)
Coordinates:
  * time         (time) datetime64 2015-01-01 2015-01-02 2015-01-03 2015-01-04 ...
  * latitude     (latitude) float64 50.0 47.5 45.0 42.5 40.0 37.5 35.0 32.5
  * longitude    (longitude) float64 -105.0 -102.5 -100.0 -97.5 -95.0 -92.5 ...
    elevation    (longitude, latitude) int64 201 231 582 239 1848 1004 1004 ...
    land_cover   (longitude, latitude) object 'forest' 'urban' 'farmland'...
Data variables:
    temperature  (time, longitude, latitude) float64 13.7 8.031 18.36 24.95 ...
    pressure     (time, longitude, latitude) float64 1.374 1.142 1.388 0.9992 ...
![Page 9: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/9.jpg)
# numpy style
ds.temperature[0, :, :]
# pandas style
ds.temperature.loc[:, -90, 50]
# with dimension names
ds.sel(time='2015-01-01')
ds.sel(longitude=-90, latitude=50, method='nearest')
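The last call passes method='nearest', which matches a requested value to the closest coordinate label instead of requiring an exact match. A minimal plain-Python sketch of that lookup rule, using the latitude labels from the Dataset printed earlier. `nearest_label` is a hypothetical helper written for illustration; it is not part of xray and says nothing about how xray implements the lookup.

```python
# Latitude labels copied from the Dataset repr shown earlier.
latitudes = [50.0, 47.5, 45.0, 42.5, 40.0, 37.5, 35.0, 32.5]


def nearest_label(labels, target):
    """Return (position, label) for the label closest to target.

    Hypothetical helper: it illustrates the idea behind method='nearest',
    not xray's actual implementation.
    """
    position = min(range(len(labels)), key=lambda i: abs(labels[i] - target))
    return position, labels[position]


# An exact match resolves to itself.
exact = nearest_label(latitudes, 50)
# An off-grid value snaps to the closest label on the grid.
snapped = nearest_label(latitudes, 41.0)
print(exact, snapped)
```

Without method='nearest', a request for a value such as 41.0 that is not on the grid would have no label to match.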
![Page 10: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/10.jpg)
# math
(10 + ds) ** 0.5
ds.temperature + ds.pressure
np.sin(ds.temperature)
# aggregation
ds.mean(dim='time')
ds.max(dim=['latitude', 'longitude'])
![Page 11: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/11.jpg)
[Figure: array over time + array over space = array over time and space]
Result has the union of all dimension names
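The rule on this slide can be sketched in plain Python for the simplest case: two 1-D arrays whose dimension names differ. `add_by_name` is a hypothetical helper and the numbers are made up; this is only an illustration of the rule, not how xray performs broadcasting.

```python
def add_by_name(a_dim, a_values, b_dim, b_values):
    """Add two 1-D labeled arrays that have different dimension names.

    Hypothetical helper for illustration only. Returns (dims, values),
    where dims is the union of the two dimension names and values is a
    nested list with one row per element of a and one column per
    element of b.
    """
    if a_dim == b_dim:
        raise ValueError("this sketch only covers distinct dimension names")
    dims = (a_dim, b_dim)
    values = [[x + y for y in b_values] for x in a_values]
    return dims, values


dims, values = add_by_name("time", [1, 2, 3], "space", [10, 20])
print(dims)    # ('time', 'space')
print(values)  # [[11, 21], [12, 22], [13, 23]]
```

Because the dimensions are matched by name rather than by position, neither input has to be reshaped by hand before the addition.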
![Page 12: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/12.jpg)
Result has the intersection of coordinate labels
[Figure: adding two arrays along a year axis, ticks 2000 to 2008]
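A plain-Python sketch of the intersection rule, with each array modeled as a mapping from year label to value. `add_aligned` is a hypothetical helper and the values are made up; only the year labels echo the slide.

```python
def add_aligned(a, b):
    """Add two {label: value} mappings, keeping only labels present in both.

    Hypothetical helper: it illustrates the alignment rule, not xray's
    implementation.
    """
    shared = sorted(a.keys() & b.keys())
    return {label: a[label] + b[label] for label in shared}


# Made-up values; the two inputs overlap only in 2002 and 2003.
first = {2000: 1.0, 2001: 2.0, 2002: 3.0, 2003: 4.0}
second = {2002: 10.0, 2003: 20.0, 2004: 30.0}

total = add_aligned(first, second)
print(total)  # {2002: 13.0, 2003: 24.0}
```

Labels that appear in only one input (2000, 2001 and 2004 here) are dropped from the result rather than filled with a placeholder value.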
![Page 13: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/13.jpg)
# average by calendar month
ds.groupby('time.month').mean('time')
# resample to every 10 days
ds.resample('10D', dim='time', how='max')
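To make the two operations concrete, here is a stdlib-only sketch on a made-up daily series. `mean_by_month` and `max_every_n_days` are hypothetical helpers, and anchoring the 10-day bins at the first sample is an assumption of this sketch rather than a statement about xray's binning.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

# Made-up daily series: 60 days from 2015-01-01, value equals the day offset.
start = date(2015, 1, 1)
series = [(start + timedelta(days=i), float(i)) for i in range(60)]


def mean_by_month(samples):
    """Average values by calendar month (1 to 12), ignoring the year."""
    groups = defaultdict(list)
    for day, value in samples:
        groups[day.month].append(value)
    return {month: mean(values) for month, values in sorted(groups.items())}


def max_every_n_days(samples, n=10):
    """Maximum of each consecutive n-day bin, counted from the first sample."""
    first_day = samples[0][0]
    bins = defaultdict(list)
    for day, value in samples:
        bins[(day - first_day).days // n].append(value)
    return [max(values) for _, values in sorted(bins.items())]


monthly = mean_by_month(series)
ten_day_max = max_every_n_days(series)
print(monthly)      # {1: 15.0, 2: 44.5, 3: 59.0}
print(ten_day_max)  # [9.0, 19.0, 29.0, 39.0, 49.0, 59.0]
```

Grouping keys on a property of the timestamp (the month number), so samples from different years fall into the same group. Resampling cuts the time axis into consecutive fixed-width windows.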
![Page 14: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/14.jpg)
# xray -> numpy
ds.temperature.values
# xray -> pandas
ds.to_dataframe()
# pandas -> xray
xray.Dataset.from_dataframe(df)
![Page 15: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/15.jpg)
>>> ds = xray.open_mfdataset('/Users/shoyer/data/era-interim/*.nc')
>>> ds
<xray.Dataset>
Dimensions:    (latitude: 256, longitude: 512, time: 52596)
Coordinates:
  * latitude   (latitude) float32 89.4628 88.767 88.067 87.3661 86.6648 ...
  * longitude  (longitude) float32 0.0 0.703125 1.40625 2.10938 2.8125 ...
  * time       (time) datetime64[ns] 1979-01-01 1979-01-01T06:00:00 ...
Data variables:
    t2m        (time, latitude, longitude) float64 240.6 240.6 240.6 ...
>>> ds.nbytes * (2 ** -30)
51.363675981760025
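The size can be checked by hand from the dimensions and dtypes in the repr. This sketch assumes that nbytes sums the t2m data variable and the three coordinate arrays, with 8 bytes per float64 or datetime64[ns] element and 4 bytes per float32 element.

```python
# Sizes and dtypes read off the Dataset repr above.
n_lat, n_lon, n_time = 256, 512, 52596

# t2m is float64 (8 bytes per element) over (time, latitude, longitude).
t2m_bytes = n_time * n_lat * n_lon * 8

# latitude and longitude are float32 (4 bytes); time is datetime64[ns] (8 bytes).
coord_bytes = n_lat * 4 + n_lon * 4 + n_time * 8

total_gib = (t2m_bytes + coord_bytes) * 2 ** -30
print(total_gib)  # roughly 51.36, in line with ds.nbytes above
```

Nearly all of the total is the single t2m variable; the coordinate arrays add well under a megabyte.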
![Page 16: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/16.jpg)
ds_by_season = ds.groupby('time.season').mean('time')
t2m_range = abs(ds_by_season.sel(season='JJA')
                - ds_by_season.sel(season='DJF')).t2m
%time result = t2m_range.load()
CPU times: user 2min 1s, sys: 49.5 s, total: 2min 51s
Wall time: 38.6 s
![Page 17: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/17.jpg)
More details: continuum.io/blog/xray-dask
![Page 18: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/18.jpg)
![Page 19: xray at SciPy 2015](https://reader033.fdocuments.us/reader033/viewer/2022042615/55d6c8bfbb61ebf96a8b464f/html5/thumbnails/19.jpg)
pandas: indexing, factorize
NumPy: arrays
netCDF4, h5py, SciPy: IO
dask.array: out of core arrays