covsirphy.gis package

covsirphy.gis.gis module

class GIS(layers=None, country='ISO3', date='Date', **kwargs)

Bases: Term

Class of geographic information system to handle geo-spatial time-series data.

Parameters:
  • layers (list[str] or None) – list of layers of geographic information or None ([“ISO3”, “Province”, “City”])

  • country (str or None) – layer name of countries or None (countries are not included in the layers)

  • date (str) – column name of observation dates

Raises:

ValueError – @layers has duplicates

Note

Country level data specified with @country will be stored with ISO3 codes.
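The default layers and the duplicate check behind the ValueError can be pictured with a minimal sketch. `resolve_layers` is a hypothetical helper written for illustration only; it is not part of covsirphy and only mirrors the default value and the error condition documented above.

```python
def resolve_layers(layers=None):
    """Hypothetical helper: apply the documented default and reject duplicates."""
    # None falls back to the documented default layers
    resolved = ["ISO3", "Province", "City"] if layers is None else list(layers)
    # Layer names must be unique
    if len(set(resolved)) != len(resolved):
        raise ValueError(f"@layers has duplicates: {resolved}")
    return resolved


print(resolve_layers())                           # ['ISO3', 'Province', 'City']
print(resolve_layers(["Country", "Prefecture"]))  # ['Country', 'Prefecture']
```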

all(variables=None, errors='raise')

Return all available data.

Parameters:
  • variables (list[str] or None) – list of variables to collect or None (all available variables)

  • errors (str) – ‘raise’ or ‘coerce’

Raises:

NotRegisteredError – No records have been registered yet

Returns:

pandas.DataFrame

Index

reset index

Columns
  • (pandas.Category): columns defined by covsirphy.GIS(layers)

  • (pandas.Timestamp): observation dates, column defined by covsirphy.GIS(date)

  • columns defined by @variables

classmethod area_name(geo=None)

Return area name of the geographic information, like ‘Japan’, ‘Tokyo/Japan’, ‘Japan_UK’, ‘the world’.

Parameters:

geo (tuple(list[str] or tuple(str) or str) or str or None) – location names

Returns:

str – area name
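The naming pattern in the examples above (‘Japan’, ‘Tokyo/Japan’, ‘Japan_UK’, ‘the world’) can be reproduced with a short sketch. This is an illustrative re-implementation inferred from those examples, not the library's code; the function name `area_name_sketch` and the `sep` parameter are assumptions.

```python
def area_name_sketch(geo=None, sep="/"):
    """Illustrative only: build an area name from a geo specification."""
    # None or (None,) means no location filter, so the area is the whole world
    if geo is None or (not isinstance(geo, str) and geo[0] is None):
        return "the world"
    # A bare string is shorthand for a one-element tuple
    levels = [geo] if isinstance(geo, str) else list(geo)
    # Several names at one level are joined with "_"
    names = [lv if isinstance(lv, str) else "_".join(lv) for lv in levels]
    # The lowest level comes first and the top level last
    return sep.join(reversed(names))


print(area_name_sketch())                     # the world
print(area_name_sketch("Japan"))              # Japan
print(area_name_sketch(("Japan", "Tokyo")))   # Tokyo/Japan
print(area_name_sketch((["Japan", "UK"],)))   # Japan_UK
```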

choropleth(variable, filename, title='Choropleth map', logscale=True, **kwargs)

Create choropleth map.

Parameters:
  • variable (str) – variable name to show

  • filename (str or None) – filename to save the figure or None (display)

  • title (str) – title of the map

  • logscale (bool) – whether to convert the values to log10 scale

  • kwargs – keyword arguments of the following classes and methods:
    • covsirphy.GIS.to_geopandas(), except for @variables
    • matplotlib.pyplot.savefig() and matplotlib.pyplot.legend()
    • pandas.DataFrame.plot()

citations(variables=None)

Return citation list of the secondary data sources.

Parameters:

variables (list[str] or None) – list of variables to collect or None (all available variables)

Returns:

list[str] – citation list

layer(geo=None, start_date=None, end_date=None, variables=None, errors='raise')

Return the data at the selected layer in the date range.

Parameters:
  • geo (tuple(list[str] or tuple(str) or str) or str or None) – location names to specify the layer or None (the top level)

  • start_date (str or None) – start date, like 22Jan2020

  • end_date (str or None) – end date, like 01Feb2020

  • variables (list[str] or None) – list of variables to add or None (all available columns)

  • errors (str) – whether to raise errors, ‘raise’ or ‘coerce’

Raises:
Returns:

pandas.DataFrame

Index:

reset index

Columns
  • (str): columns defined by covsirphy.GIS(layers)

  • (pandas.Timestamp): observation dates, column defined by covsirphy.GIS(date)

  • columns defined by @variables

Note

Records with NA as the country name are always removed.

Note

When geo=None or geo=(None,), returns country-level data, assuming we have country/province/city as layers here.

Note

When geo=(“Japan”,) or geo=“Japan”, returns province-level data in Japan.

Note

When geo=([“Japan”, “UK”],), returns province-level data in Japan and UK.

Note

When geo=(“Japan”, “Kanagawa”), returns city-level data in Kanagawa/Japan.

Note

When geo=(“Japan”, [“Tokyo”, “Kanagawa”]), returns city-level data in Tokyo/Japan and Kanagawa/Japan.
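The rule in the notes above is that layer() returns the records one level below the locations named in @geo. The following is a minimal sketch of that rule on plain dictionaries, under stated assumptions: the layer names, the toy records, the use of None for an unset layer, and the helpers `as_levels`, `layer_sketch` and `column_values` are all hypothetical, and plain country names stand in for the ISO3 codes the real class stores. Dates are left out to keep the example short.

```python
LAYERS = ["Country", "Province", "City"]  # assumed layer names

RECORDS = [  # toy data; None marks a layer that is not set
    {"Country": "Japan", "Province": None, "City": None, "Confirmed": 100},
    {"Country": "UK", "Province": None, "City": None, "Confirmed": 80},
    {"Country": "Japan", "Province": "Tokyo", "City": None, "Confirmed": 40},
    {"Country": "Japan", "Province": "Kanagawa", "City": None, "Confirmed": 30},
    {"Country": "Japan", "Province": "Kanagawa", "City": "Yokohama", "Confirmed": 12},
    {"Country": "Japan", "Province": "Kanagawa", "City": "Kawasaki", "Confirmed": 9},
]


def as_levels(geo):
    """Normalize geo to one list of allowed names per layer."""
    if geo is None:
        return []
    if isinstance(geo, str):
        geo = (geo,)
    return [[g] if isinstance(g, str) else list(g) for g in geo if g is not None]


def layer_sketch(records, geo=None, layers=LAYERS):
    """Return the records one layer below the locations named in geo."""
    levels = as_levels(geo)
    if len(levels) >= len(layers):
        raise ValueError("geo is too deep for the given layers")
    target = len(levels)  # index of the layer to return
    selected = []
    for rec in records:
        # Upper layers must match the names given in geo
        parents_match = all(rec[layers[i]] in allowed for i, allowed in enumerate(levels))
        # The target layer must be set and all deeper layers unset
        at_target = rec[layers[target]] is not None and all(
            rec[name] is None for name in layers[target + 1:]
        )
        if parents_match and at_target:
            selected.append(rec)
    return selected


def column_values(rows, column):
    return [row[column] for row in rows]


print(column_values(layer_sketch(RECORDS), "Country"))                       # ['Japan', 'UK']
print(column_values(layer_sketch(RECORDS, "Japan"), "Province"))             # ['Tokyo', 'Kanagawa']
print(column_values(layer_sketch(RECORDS, ("Japan", "Kanagawa")), "City"))   # ['Yokohama', 'Kawasaki']
```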

register(data, layers=None, date='Date', variables=None, citations=None, convert_iso3=True, **kwargs)

Register new data.

Parameters:
  • data (pandas.DataFrame) –

    new data

    Index

    reset index

    Columns
    • columns defined by @layers

    • column defined by @date

    • columns defined by @variables

  • layers (list[str] or None) – layers of the data or None (same as covsirphy.GIS(layers))

  • date (str) – column name of observation dates of the data

  • variables (list[str] or None) – list of variables to add or None (all available columns)

  • citations (list[str] or str or None) – citations of the dataset or None ([“my own dataset”])

  • convert_iso3 (bool) – whether to convert country names to ISO3 codes

  • **kwargs – keyword arguments of pandas.to_datetime() including “dayfirst (bool): whether date format is DD/MM or not”

Raises:

ValueError – @layers has duplicates

Returns:

covsirphy.GIS – self
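The dayfirst keyword matters because a date string such as 01/02/2020 is ambiguous. The sketch below uses only the standard library to show the two readings; it does not call covsirphy or pandas, and the variable names are illustrative.

```python
from datetime import datetime

raw = "01/02/2020"

# Read as DD/MM/YYYY, the interpretation requested with dayfirst=True
day_first = datetime.strptime(raw, "%d/%m/%Y")
# Read as MM/DD/YYYY
month_first = datetime.strptime(raw, "%m/%d/%Y")

print(day_first.date())    # 2020-02-01
print(month_first.date())  # 2020-01-02
```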

subset(geo=None, start_date=None, end_date=None, variables=None, errors='raise')

Return subset of the location and date range.

Parameters:
  • geo (tuple(list[str] or tuple(str) or str) or str or None) – location names to filter or None (total at the top level)

  • start_date (str or None) – start date, like 22Jan2020

  • end_date (str or None) – end date, like 01Feb2020

  • variables (list[str] or None) – list of variables to add or None (all available columns)

  • errors (str) – whether to raise errors, ‘raise’ or ‘coerce’

Raises:
Returns:

pandas.DataFrame

Index:

reset index

Columns
  • (pandas.Timestamp): observation dates, column defined by covsirphy.GIS(date)

  • columns defined by @variables

Note

Records with NA as the country name are always removed.

Note

When geo=None or geo=(None,), returns global scale records (total values of all country-level data), assuming we have country/province/city as layers here.

Note

When geo=(“Japan”,) or geo=“Japan”, returns country-level data in Japan.

Note

When geo=([“Japan”, “UK”],), returns country-level data of Japan and UK.

Note

When geo=(“Japan”, “Tokyo”), returns province-level data of Tokyo/Japan.

Note

When geo=(“Japan”, [“Tokyo”, “Kanagawa”]), returns total values of province-level data of Tokyo/Japan and Kanagawa/Japan.

Note

When geo=(“Japan”, “Kanagawa”, “Yokohama”), returns city-level data of Yokohama/Kanagawa/Japan.

Note

When geo=(“Japan”, “Kanagawa”, [“Yokohama”, “Kawasaki”]), returns total values of city-level data of Yokohama/Kanagawa/Japan and Kawasaki/Kanagawa/Japan.
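Unlike layer(), subset() selects the records at the level named by @geo itself and totals them per date when several locations match. The following is a minimal sketch of that behavior on plain dictionaries, under stated assumptions: the layer names, the toy records, the use of None for an unset layer, and the function `subset_sketch` are hypothetical, and plain country names stand in for the ISO3 codes the real class stores.

```python
from collections import defaultdict

LAYERS = ["Country", "Province", "City"]  # assumed layer names
DAY = "2020-01-22"

RECORDS = [  # toy data; None marks a layer that is not set
    {"Date": DAY, "Country": "Japan", "Province": None, "City": None, "Confirmed": 100},
    {"Date": DAY, "Country": "UK", "Province": None, "City": None, "Confirmed": 80},
    {"Date": DAY, "Country": "Japan", "Province": "Tokyo", "City": None, "Confirmed": 40},
    {"Date": DAY, "Country": "Japan", "Province": "Kanagawa", "City": None, "Confirmed": 30},
    {"Date": DAY, "Country": "Japan", "Province": "Kanagawa", "City": "Yokohama", "Confirmed": 12},
    {"Date": DAY, "Country": "Japan", "Province": "Kanagawa", "City": "Kawasaki", "Confirmed": 9},
]


def subset_sketch(records, geo=None, layers=LAYERS, variable="Confirmed"):
    """Total the variable per date over the locations named in geo."""
    if geo is None:
        geo = ()
    elif isinstance(geo, str):
        geo = (geo,)
    # One list of allowed names per layer; (None,) behaves like None
    levels = [[g] if isinstance(g, str) else list(g) for g in geo if g is not None]
    # With no location given, total all top-level (country) records
    depth = len(levels) or 1
    totals = defaultdict(int)
    for rec in records:
        matches = all(rec[layers[i]] in allowed for i, allowed in enumerate(levels))
        # Keep only records recorded exactly at the requested depth
        at_depth = rec[layers[depth - 1]] is not None and all(
            rec[name] is None for name in layers[depth:]
        )
        if matches and at_depth:
            totals[rec["Date"]] += rec[variable]
    return dict(sorted(totals.items()))


print(subset_sketch(RECORDS))                                    # {'2020-01-22': 180}
print(subset_sketch(RECORDS, "Japan"))                           # {'2020-01-22': 100}
print(subset_sketch(RECORDS, ("Japan", ["Tokyo", "Kanagawa"])))  # {'2020-01-22': 70}
```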

to_geopandas(geo=None, on=None, variables=None, directory=None, natural_earth=None)

Add geometry information with GeoJSON file of “Natural Earth” GitHub repository to data.

Parameters:
  • geo (tuple(list[str] or tuple(str) or str) or str or None) – location names to specify the layer or None (the top level)

  • on (str or None) – the date, like 22Jan2020, or None (the last date of each location)

  • variables (list[str] or None) – list of variables to add or None (all available columns)

  • directory (list[str] or tuple(str) or str or None) – top directory name(s) to save GeoJSON files or None (directory of this script)

  • natural_earth (str or None) – title of GeoJSON file (without extension) of “Natural Earth” GitHub repository or None (automatically determined)

Raises:

ValueError – country layer is not included in the dataset

Returns:

geopandas.GeoDataFrame

Index:
  • reset index

Columns:
  • (str): layer focused on with @geo and GIS.layer()

  • (pandas.Timestamp): observation dates, column defined by covsirphy.GIS(date)

  • geometry: geometric information

Note

Regarding @geo argument, please refer to covsirphy.GIS.layer().

Note

GeoJSON files are listed in https://github.com/nvkelso/natural-earth-vector/tree/master/geojson

Natural Earth (Free vector and raster map data at naturalearthdata.com, Public Domain): https://www.naturalearthdata.com/ and https://github.com/nvkelso/natural-earth-vector