Usage: exploratory data analysis


Here, we will review the datasets downloaded and cleaned with the DataLoader class. Methods of this class produce the following class instances.

  1. JHUData: the number of confirmed/infected/fatal/recovered cases

  2. OxCGRTData: indicators of government responses (OxCGRT)

  3. PCRData: the number of tests

  4. VaccineData: the number of vaccinations, people vaccinated

  5. MobilityData: number of visits as a percentage of the baseline

  6. PyramidData: population pyramid

  7. JapanData: Japan-specific dataset

If you want to use a new dataset for your analysis, please let us know via GitHub Issues: Request new method of DataLoader class.

Note:
LinelistData (linelist of case reports) was deprecated with issue #866 at development version 2.22.0.
Note:
PopulationData (population values) was deprecated with issue #904 at development version 2.22.0.

In this notebook, we review the cleaned datasets one by one and visualize them.

Preparation

Import the packages. Please confirm that the latest version of covsirphy is installed.

!pip install --upgrade covsirphy
[1]:
# !pip install covsirphy --upgrade
from pprint import pprint
import covsirphy as cs
cs.__version__
[1]:
'2.24.0'

Data cleaning classes are created with methods of the DataLoader class. Please specify the directory in which to save CSV files when creating a DataLoader instance. The default directory is “input”, and we will set “../input” here.

Note:
Please find the details of DataLoader at Usage: data loading.
[2]:
# Create DataLoader instance
loader = cs.DataLoader("../input")

Usage of methods will be explained in the following sections. If you want to download all datasets with copy & paste, please refer to Dataset preparation.

The number of cases (JHU style)

The main data for analysis is the number of cases. The JHUData class, created with the DataLoader.jhu() method, holds the number of confirmed/fatal/recovered cases. The number of infected cases is calculated as “Confirmed - Recovered - Fatal” during data cleaning.

[3]:
# Create instance
jhu_data = loader.jhu()
Retrieving COVID-19 dataset in Japan from https://github.com/lisphilar/covid19-sir/data/japan
Retrieving datasets from COVID-19 Data Hub https://covid19datahub.io/
        Please set verbose=2 to see the detailed citation list.
Retrieving datasets from Our World In Data https://github.com/owid/covid-19-data/
Retrieving datasets from COVID-19 Open Data by Google Cloud Platform https://github.com/GoogleCloudPlatform/covid-19-open-data
[4]:
# Check type
type(jhu_data)
[4]:
covsirphy.cleaning.jhu_data.JHUData

JHUData.citation property shows the description of this dataset.

[5]:
print(jhu_data.citation)
Hirokazu Takaya (2020-2022), COVID-19 dataset in Japan, GitHub repository, https://github.com/lisphilar/covid19-sir/data/japan
(Secondary source) Guidotti, E., Ardia, D., (2020), "COVID-19 Data Hub", Journal of Open Source Software 5(51):2376, doi: 10.21105/joss.02376.

The detailed citation list is saved in the DataLoader.covid19dh_citation property. This is not a property of JHUData. Because many links are included, they will not be shown in this tutorial.

[6]:
# Detailed citations (string)
# loader.covid19dh_citation

We can check the raw data with JHUData.raw property.

[7]:
jhu_data.raw.tail()
[7]:
Date ISO3 Country Province Confirmed Infected Fatal Recovered Population
1215023 2022-03-29 ZWE Zimbabwe - 246042.0 NaN 5439.0 82994.0 14439018.0
1215024 2022-03-30 ZWE Zimbabwe - 246182.0 NaN 5440.0 82994.0 14439018.0
1215025 2022-03-31 ZWE Zimbabwe - 246286.0 NaN 5444.0 82994.0 14439018.0
1215026 2022-04-01 ZWE Zimbabwe - 246414.0 NaN 5444.0 82994.0 14439018.0
1215027 2022-04-02 ZWE Zimbabwe - 246481.0 NaN 5446.0 82994.0 14439018.0

The cleaned dataset is here.

[8]:
jhu_data.cleaned().tail()
[8]:
Date ISO3 Country Province Confirmed Infected Fatal Recovered Population
1215023 2022-03-29 ZWE Zimbabwe - 246042 157609 5439 82994 14439018
1215024 2022-03-30 ZWE Zimbabwe - 246182 157748 5440 82994 14439018
1215025 2022-03-31 ZWE Zimbabwe - 246286 157848 5444 82994 14439018
1215026 2022-04-01 ZWE Zimbabwe - 246414 157976 5444 82994 14439018
1215027 2022-04-02 ZWE Zimbabwe - 246481 158041 5446 82994 14439018

As you may have noticed, both are returned as pandas DataFrames. Because the last rows hold the latest values, pandas.DataFrame.tail() was used to review them.
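
As a minimal sketch of the “Confirmed - Recovered - Fatal” calculation, the following reproduces the Infected values of two Zimbabwe rows shown above with plain pandas. The toy DataFrame `records` is built by hand for illustration and does not use covsirphy.

```python
import pandas as pd

# Toy records copied from two Zimbabwe rows of the tables above
records = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2022-03-29", "2022-03-30"]),
        "Confirmed": [246042, 246182],
        "Fatal": [5439, 5440],
        "Recovered": [82994, 82994],
    }
)
# Infected = Confirmed - Recovered - Fatal
records["Infected"] = records["Confirmed"] - records["Recovered"] - records["Fatal"]
print(records)
```

The resulting values match the Infected column of the cleaned dataset for the same dates.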

Check the data types and memory usage as follows.

[9]:
jhu_data.cleaned().info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1215028 entries, 0 to 1215027
Data columns (total 9 columns):
 #   Column      Non-Null Count    Dtype
---  ------      --------------    -----
 0   Date        1215028 non-null  datetime64[ns]
 1   ISO3        1215028 non-null  category
 2   Country     1215028 non-null  category
 3   Province    1215028 non-null  category
 4   Confirmed   1215028 non-null  int64
 5   Infected    1215028 non-null  int64
 6   Fatal       1215028 non-null  int64
 7   Recovered   1215028 non-null  int64
 8   Population  1215028 non-null  int64
dtypes: category(3), datetime64[ns](1), int64(5)
memory usage: 62.6 MB

Note that the dates are stored as datetime64[ns], the area names (ISO3, Country, Province) use the pandas category dtype, and the numbers of cases are numpy.int64.
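
The same dtypes can be produced with plain pandas. This is only an illustrative sketch on a hand-made DataFrame (`df` and its values are assumptions for the example), not the conversion code used inside covsirphy.

```python
import pandas as pd

# Hand-made records with string dates, string area names and float counts
df = pd.DataFrame(
    {
        "Date": ["2022-04-01", "2022-04-02", "2022-04-01", "2022-04-02"],
        "Country": ["Japan", "Japan", "Zimbabwe", "Zimbabwe"],
        "Confirmed": [6552920.0, 6606464.0, 246414.0, 246481.0],
    }
)
# Convert each column to the dtype family reported by DataFrame.info() above
df["Date"] = pd.to_datetime(df["Date"])
df["Country"] = df["Country"].astype("category")
df["Confirmed"] = df["Confirmed"].astype("int64")
print(df.dtypes)
```

Storing repeated area names as a category dtype is a common way to keep memory usage low when the same names appear on many rows.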

Total number of cases in all countries

JHUData.total() returns the total number of cases in all countries, with fatality and recovery rates added as columns.

[10]:
total_df = jhu_data.total()
# Show the oldest data
display(total_df.loc[total_df["Confirmed"] > 0].head())
# Show the latest data
display(total_df.tail())
Confirmed Infected Fatal Recovered Fatal per Confirmed Recovered per Confirmed Fatal per (Fatal or Recovered)
Date
2020-01-02 1 0 0 1 0.0 1.000000 0.0
2020-01-04 1 1 0 0 0.0 0.000000 NaN
2020-01-05 3 1 0 2 0.0 0.666667 0.0
2020-01-06 4 4 0 0 0.0 0.000000 NaN
2020-01-07 4 4 0 0 0.0 0.000000 NaN
Confirmed Infected Fatal Recovered Fatal per Confirmed Recovered per Confirmed Fatal per (Fatal or Recovered)
Date
2022-03-30 484811803 296505631 6154772 182151400 0.012695 0.375716 0.032685
2022-03-31 485862779 297426789 6157997 182277993 0.012674 0.375164 0.032680
2022-04-01 443643146 269739781 5515864 168387501 0.012433 0.379556 0.031718
2022-04-02 431051825 262097660 5363447 163590718 0.012443 0.379515 0.031745
2022-04-03 44493901 19602403 327077 24564421 0.007351 0.552085 0.013140

In the output above, the first record with a confirmed case registered in the dataset is dated 02Jan2020. The COVID-19 outbreak is still ongoing.
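
The rate columns can be reproduced from the case counts. The following sketch assumes the rates are simple ratios, as the column names suggest, and uses the 2022-03-30 row of the table above as a hand-made input (`total` is an illustrative name).

```python
import pandas as pd

# One row of totals copied from the table above (2022-03-30)
total = pd.DataFrame(
    {"Confirmed": [484811803], "Fatal": [6154772], "Recovered": [182151400]},
    index=pd.to_datetime(["2022-03-30"]),
)
# Ratios named after the columns returned by JHUData.total()
total["Fatal per Confirmed"] = total["Fatal"] / total["Confirmed"]
total["Recovered per Confirmed"] = total["Recovered"] / total["Confirmed"]
total["Fatal per (Fatal or Recovered)"] = total["Fatal"] / (total["Fatal"] + total["Recovered"])
print(total.round(6).T)
```

With this definition, a row in which both Fatal and Recovered are zero gives NaN for the last ratio, because pandas evaluates 0/0 as NaN. This is consistent with the NaN values in the oldest rows above.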

We can create line plots with covsirphy.line_plot() function.

[11]:
cs.line_plot(total_df[["Infected", "Fatal", "Recovered"]], "Total number of cases over time")
[Figure: line plot "Total number of cases over time" (_images/usage_dataset_26_0.png)]

Statistics of fatality and recovery rate are here.

[12]:
total_df.loc[:, total_df.columns.str.contains("per")].describe().T
[12]:
count mean std min 25% 50% 75% max
Fatal per Confirmed 822.0 0.027162 0.013084 0.0 0.020560 0.021731 0.031178 0.065194
Recovered per Confirmed 822.0 0.468943 0.184317 0.0 0.395970 0.553999 0.583285 1.000000
Fatal per (Fatal or Recovered) 815.0 0.120675 0.222446 0.0 0.034402 0.036802 0.066674 1.000000

Subset for area

JHUData.records() and JHUData.subset() create a subset for a specific area. We can select a country name and a province name. In this tutorial, “Japan” and “Tokyo in Japan” will be used. Please replace them with your country/province name.

Subset for a country:
We can use either country names or ISO3 codes.
[13]:
# Specify country name
df, complement = jhu_data.records("Japan")
# Or, specify ISO3 code
# df, complement = jhu_data.records("JPN")
# Show records
display(df.tail())
# Show details of complement
print(complement)
Date Confirmed Infected Fatal Recovered Susceptible
783 2022-03-30 6452108 415942 27913 6008253 120076992
784 2022-03-31 6504873 428780 28010 6048083 120024227
785 2022-04-01 6552920 441767 28097 6083056 119976180
786 2022-04-02 6606464 455864 28200 6122400 119922636
787 2022-04-03 6653841 460801 28248 6164792 119875259
monotonic increasing complemented confirmed data and
monotonic increasing complemented fatal data and
partially complemented recovered data

The records were complemented. The second returned value describes the complement that was applied. Details will be explained later. We can skip the complement with the auto_complement=False argument. Alternatively, just use the JHUData.subset() method when the second returned value (False because there is no complement) is unnecessary.

[14]:
# Skip complement
df, complement = jhu_data.records("Japan", auto_complement=False)
# Or,
# df = jhu_data.subset("Japan")
display(df.tail())
# Show complement (False because not complemented)
print(complement)
Date Confirmed Infected Fatal Recovered Susceptible
783 2022-03-30 6452108 415942 27913 6008253 120076992
784 2022-03-31 6504873 428780 28010 6048083 120024227
785 2022-04-01 6552920 441767 28097 6083056 119976180
786 2022-04-02 6606464 455864 28200 6122400 119922636
787 2022-04-03 6653841 460801 28248 6164792 119875259
False

Subset for a province (called “prefecture” in Japan):

[15]:
df, _ = jhu_data.records("Japan", province="Tokyo")
df.tail()
[15]:
Date Confirmed Infected Fatal Recovered Susceptible
742 2022-03-30 1242659 98205 4157 1140297 12700197
743 2022-03-31 1250651 101863 4169 1144619 12692205
744 2022-04-01 1258633 106345 4178 1148110 12684223
745 2022-04-02 1266028 106438 4182 1155408 12676828
746 2022-04-03 1273991 114392 4191 1155408 12668865

The list of countries can be checked with JHUData.countries() as follows.

[16]:
pprint(jhu_data.countries(), compact=True)
['Afghanistan', 'Albania', 'Algeria', 'American Samoa', 'Andorra', 'Angola',
 'Anguilla', 'Antigua and Barbuda', 'Argentina', 'Armenia', 'Aruba',
 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain', 'Bangladesh',
 'Barbados', 'Belarus', 'Belgium', 'Belize', 'Benin', 'Bermuda', 'Bhutan',
 'Bolivia', 'Bonaire, Sint Eustatius and Saba', 'Bosnia and Herzegovina',
 'Botswana', 'Brazil', 'Brunei', 'Bulgaria', 'Burkina Faso', 'Burundi',
 'Cambodia', 'Cameroon', 'Canada', 'Cape Verde', 'Cayman Islands',
 'Central African Republic', 'Chad', 'Chile', 'China', 'Colombia', 'Comoros',
 'Cook Islands', 'Costa Rica', "Cote d'Ivoire", 'Croatia', 'Cuba', 'Curacao',
 'Cyprus', 'Czech Republic', 'Democratic Republic of the Congo', 'Denmark',
 'Djibouti', 'Dominica', 'Dominican Republic', 'Ecuador', 'Egypt',
 'El Salvador', 'Equatorial Guinea', 'Eritrea', 'Estonia', 'Ethiopia',
 'Falkland Islands (Malvinas)', 'Faroe Islands', 'Fiji', 'Finland', 'France',
 'French Guiana', 'French Polynesia', 'Gabon', 'Gambia', 'Georgia', 'Germany',
 'Ghana', 'Gibraltar', 'Greece', 'Greenland', 'Grenada', 'Guadeloupe', 'Guam',
 'Guatemala', 'Guinea', 'Guinea-Bissau', 'Guyana', 'Haiti', 'Holy See',
 'Honduras', 'Hong Kong', 'Hungary', 'Iceland', 'India', 'Indonesia', 'Iran',
 'Iraq', 'Ireland', 'Isle of Man', 'Israel', 'Italy', 'Jamaica', 'Japan',
 'Jordan', 'Kazakhstan', 'Kenya', 'Kiribati', 'Kosovo', 'Kuwait', 'Kyrgyzstan',
 'Laos', 'Latvia', 'Lebanon', 'Lesotho', 'Liberia', 'Libya', 'Liechtenstein',
 'Lithuania', 'Luxembourg', 'Madagascar', 'Malawi', 'Malaysia', 'Maldives',
 'Mali', 'Malta', 'Marshall Islands', 'Martinique', 'Mauritania', 'Mauritius',
 'Mayotte', 'Mexico', 'Micronesia', 'Moldova', 'Monaco', 'Mongolia',
 'Montenegro', 'Montserrat', 'Morocco', 'Mozambique', 'Myanmar', 'Namibia',
 'Nepal', 'Netherlands', 'New Caledonia', 'New Zealand', 'Nicaragua', 'Niger',
 'Nigeria', 'North Macedonia', 'Northern Mariana Islands', 'Norway', 'Oman',
 'Pakistan', 'Palau', 'Palestine', 'Panama', 'Papua New Guinea', 'Paraguay',
 'Peru', 'Philippines', 'Poland', 'Portugal', 'Puerto Rico', 'Qatar',
 'Republic of the Congo', 'Romania', 'Russia', 'Rwanda', 'Réunion',
 'Saint Helena, Ascension and Tristan da Cunha', 'Saint Kitts and Nevis',
 'Saint Lucia', 'Saint Vincent and the Grenadines', 'Samoa', 'San Marino',
 'Sao Tome and Principe', 'Saudi Arabia', 'Senegal', 'Serbia', 'Seychelles',
 'Sierra Leone', 'Singapore', 'Sint Maarten', 'Slovakia', 'Slovenia',
 'Solomon Islands', 'Somalia', 'South Africa', 'South Korea', 'South Sudan',
 'Spain', 'Sri Lanka', 'Sudan', 'Suriname', 'Swaziland', 'Sweden',
 'Switzerland', 'Syria', 'Taiwan', 'Tajikistan', 'Tanzania', 'Thailand',
 'Timor-Leste', 'Togo', 'Tonga', 'Trinidad and Tobago', 'Tunisia', 'Turkey',
 'Turks and Caicos Islands', 'Uganda', 'Ukraine', 'United Arab Emirates',
 'United Kingdom', 'United States', 'Uruguay', 'Uzbekistan', 'Vanuatu',
 'Venezuela', 'Vietnam', 'Virgin Islands, British', 'Virgin Islands, U.S.',
 'Wallis and Futuna', 'Yemen', 'Zambia', 'Zimbabwe']

Complement

JHUData.records() automatically complements the records if necessary when auto_complement=True (default). Each area can have no, one, or multiple complements, depending on the records and their preprocessing analysis.

We can show the specific kind of complements that were applied to the records of each country with JHUData.show_complement() method. The possible kinds of complement for each country are the following:

  1. “Monotonic_confirmed/fatal/recovered” (monotonic increasing complement) Force the variable to be monotonically increasing.

  2. “Full_recovered” (full complement of recovered data) Estimate the number of recovered cases using the value of estimated average recovery period.

  3. “Partial_recovered” (partial complement of recovered data) When recovered values are not updated for some days, extrapolate the values.

Note:
“Recovery period” will be discussed in the next subsection.
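
To illustrate what a monotonic increasing complement means, the sketch below forces a cumulative series to be non-decreasing with a running maximum. This is an assumed, simplified technique chosen for illustration; the algorithm implemented in covsirphy may differ. The series values are made up.

```python
import pandas as pd

# Made-up cumulative counts with two decreases (115 after 120, 128 after 130)
confirmed = pd.Series(
    [100, 120, 115, 130, 128, 150],
    index=pd.date_range("2022-01-01", periods=6, freq="D"),
    name="Confirmed",
)
# A running maximum replaces every decrease with the previous peak value
monotonic = confirmed.cummax()
print(pd.DataFrame({"raw": confirmed, "monotonic": monotonic}))
```

A running maximum keeps the earlier, higher value. Other strategies, such as lowering the earlier values instead, are equally possible and lead to different records.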

For JHUData.show_complement(), we can specify country names and province names.

[17]:
# Specify country name
jhu_data.show_complement(country="Japan")
# Or, specify country and province name
# jhu_data.show_complement(country="Japan", province="Tokyo")
[17]:
Country Province Monotonic_confirmed Monotonic_fatal Monotonic_recovered Full_recovered Partial_recovered
0 Japan - True True True False True

When a list is passed as the country argument, all specified countries will be shown. If None, all registered countries will be used.

[18]:
# Specify country names
jhu_data.show_complement(country=["Greece", "Japan"])
# Or, apply None
# jhu_data.show_complement(country=None)
[18]:
Country Province Monotonic_confirmed Monotonic_fatal Monotonic_recovered Full_recovered Partial_recovered
0 Greece - False False False False True
1 Japan - True True True False True

If the complement was performed incorrectly or you need new algorithms, kindly let us know via the issue page.

Recovery period

We define the “recovery period” as the time period between case confirmation and recovery (as it is subjectively defined per country). With the global case records, we estimate the average recovery period using JHUData.calculate_recovery_period().

[19]:
recovery_period = jhu_data.calculate_recovery_period()
print(f"Average recovery period: {recovery_period} [days]")
Average recovery period: 15 [days]

Currently, we calculate the difference between confirmed cases and fatal cases and try to match it to a value of recovered cases at a later date. We apply this method to every country that has valid recovery data and average the partial recovery periods to obtain a single (average) recovery period. During the calculations, we ignore time intervals that lead to very short (<7 days) or very long (>90 days) partial recovery periods, if these occur with high frequency (>50%) in the records. For this analysis, we have to assume that the compartments are temporarily invariable in order to extract an approximation of the average recovery period.
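
The matching idea can be sketched for a single area as follows. This is a simplified illustration, not the covsirphy implementation: the function name `estimate_recovery_period` and its `lower`/`upper` arguments are hypothetical, the rows are assumed to be daily, and periods outside 7 to 90 days are simply dropped instead of applying the frequency rule (>50%) described above.

```python
import numpy as np
import pandas as pd


def estimate_recovery_period(df, lower=7, upper=90):
    """Hypothetical, simplified estimate of the recovery period [days] for one area."""
    # Cases expected to recover eventually: confirmed minus fatal
    target = (df["Confirmed"] - df["Fatal"]).to_numpy()
    # searchsorted requires a non-decreasing array
    recovered = df["Recovered"].cummax().to_numpy()
    # Index of the first day on which Recovered reaches each target value
    idx = np.searchsorted(recovered, target, side="left")
    valid = idx < len(recovered)
    periods = idx[valid] - np.arange(len(df))[valid]
    # Drop very short or very long partial recovery periods
    periods = periods[(periods >= lower) & (periods <= upper)]
    return float(periods.mean()) if len(periods) else float("nan")


# Toy data: recovered cases follow confirmed cases with a 10-day delay
days = 40
toy = pd.DataFrame({"Confirmed": [10 * (d + 1) for d in range(days)], "Fatal": [0] * days})
toy["Recovered"] = toy["Confirmed"].shift(10, fill_value=0)
period = estimate_recovery_period(toy)
print(f"Estimated recovery period: {period} [days]")
```

For a global estimate, a function of this kind would be applied to each country with valid recovery data and the results averaged, as described above.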

Alternatively, we tried to use a linelist of case reports to get a precise value of the recovery period (average of recovery date minus confirmation date for cases), but the number of records was too small.

Visualize the number of cases at a time point

We can visualize the number of cases with the JHUData.map() method. When country is None, a global map will be shown.

Global map with country level data:

[20]:
# Global map with country level data
jhu_data.map(country=None, variable="Infected")
# To include/exclude some countries
# jhu_data.map(country=None, variable="Infected", included=["Japan"])
# jhu_data.map(country=None, variable="Infected", excluded=["Japan"])
# To change the date
# jhu_data.map(country=None, variable="Infected", date="01Oct2021")
[Figure: global map of Infected cases by country (_images/usage_dataset_50_0.png)]

Values can be retrieved with .layer() method.

[21]:
jhu_data.layer(country=None).tail()
[21]:
Date ISO3 Country Confirmed Infected Fatal Recovered Population
186122 2022-03-29 ZWE Zimbabwe 246042 157609 5439 82994 14439018
186123 2022-03-30 ZWE Zimbabwe 246182 157748 5440 82994 14439018
186124 2022-03-31 ZWE Zimbabwe 246286 157848 5444 82994 14439018
186125 2022-04-01 ZWE Zimbabwe 246414 157976 5444 82994 14439018
186126 2022-04-02 ZWE Zimbabwe 246481 158041 5446 82994 14439018
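
A map shows one value per area at a single time point. The sketch below selects such a snapshot from a layer-like table with pandas. The DataFrame `layer` is hand-made from values shown in this notebook, and parsing the date string with the "%d%b%Y" format (as in "01Oct2021") is an assumption about how such strings can be interpreted.

```python
import pandas as pd

# Hand-made table in the style of the .layer() output
layer = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2022-04-01", "2022-04-02", "2022-04-01", "2022-04-02"]),
        "Country": ["Zimbabwe", "Zimbabwe", "Japan", "Japan"],
        "Infected": [157976, 158041, 441767, 455864],
    }
)
# Parse a date string written like the date argument in the cell above
target = pd.to_datetime("02Apr2022", format="%d%b%Y")
# One value per country at the selected date
snapshot = layer.loc[layer["Date"] == target].set_index("Country")["Infected"]
print(snapshot)
```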

Country map with province level data:

[22]:
# Country map with province level data
jhu_data.map(country="Japan", variable="Infected")
# To include/exclude some provinces
# jhu_data.map(country="Japan", variable="Infected", included=["Tokyo"])
# jhu_data.map(country="Japan", variable="Infected", excluded=["Tokyo"])
# To change the date
# jhu_data.map(country="Japan", variable="Infected", date="01Oct2021")
[Figure: map of Infected cases in Japan by province (_images/usage_dataset_54_0.png)]

Values are here.

[23]:
jhu_data.layer(country="Japan").tail()
[23]:
Date ISO3 Country Province Confirmed Infected Fatal Recovered Population
38848 2022-03-30 JPN Japan Yamanashi 22826 1653 63 21110 812056
38849 2022-03-31 JPN Japan Yamanashi 23127 1751 64 21312 812056
38850 2022-04-01 JPN Japan Yamanashi 23382 1793 64 21525 812056
38851 2022-04-02 JPN Japan Yamanashi 23382 1793 64 21525 812056
38852 2022-04-03 JPN Japan Yamanashi 24119 2536 64 21519 812056
Note for Japan:
Province “Entering” means the number of cases who were confirmed when entering Japan.

OxCGRT indicators

Government responses are tracked with the Oxford Covid-19 Government Response Tracker (OxCGRT). Because government responses and people's activities change the parameter values of SIR-derived models, this dataset is important when we try to forecast the number of cases. An OxCGRTData instance is created with the DataLoader.oxcgrt() method.

OxCGRT indicators are

  • school_closing,

  • workplace_closing,

  • cancel_events,

  • gatherings_restrictions,

  • transport_closing,

  • stay_home_restrictions,

  • internal_movement_restrictions,

  • international_movement_restrictions,

  • information_campaigns,

  • testing_policy, and

  • contact_tracing.

[24]:
oxcgrt_data = loader.oxcgrt()
[25]:
type(oxcgrt_data)
[25]:
covsirphy.cleaning.oxcgrt.OxCGRTData

Because the records are retrieved via “COVID-19 Data Hub”, as with JHUData, the data description and raw data are the same.

[26]:
# Description
print(oxcgrt_data.citation)
# Raw
# oxcgrt_data.raw.tail()

The cleaned dataset is here.

[27]:
oxcgrt_data.cleaned().tail()
[27]:
Date ISO3 Country Province School_closing Workplace_closing Cancel_events Gatherings_restrictions Transport_closing Stay_home_restrictions Internal_movement_restrictions International_movement_restrictions Information_campaigns Testing_policy Contact_tracing Stringency_index
1219264 2022-01-15 GRL Greenland Syddanmark 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 2.0 2.0 3.0 2.0 -38.89
1219265 2022-01-16 GRL Greenland Syddanmark 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 2.0 2.0 3.0 2.0 -38.89
1219266 2022-01-17 GRL Greenland Syddanmark 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 2.0 2.0 3.0 2.0 -38.89
1219267 2022-01-18 GRL Greenland Syddanmark 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 2.0 2.0 3.0 2.0 -38.89
1219268 2022-01-19 GRL Greenland Syddanmark 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 2.0 2.0 3.0 2.0 -38.89

Subset for area

OxCGRTData.subset() creates a subset for a specific area. We can select only a country name. Note that province level data is not registered in OxCGRTData.

Subset for a country:
We can use either country names or ISO3 codes.
[28]:
oxcgrt_data.subset("Japan").tail()
# Or, with ISO3 code
# oxcgrt_data.subset("JPN").tail()
[28]:
Date School_closing Workplace_closing Cancel_events Gatherings_restrictions Transport_closing Stay_home_restrictions Internal_movement_restrictions International_movement_restrictions Information_campaigns Testing_policy Contact_tracing Stringency_index
798 2022-03-30 1.0 1.0 1.0 1.0 -1.0 1.0 1.0 4.0 2.0 1.0 1.0 47.22
799 2022-03-31 1.0 1.0 1.0 1.0 -1.0 1.0 1.0 4.0 2.0 1.0 1.0 47.22
800 2022-04-01 1.0 1.0 1.0 1.0 -1.0 1.0 1.0 4.0 2.0 1.0 1.0 47.22
801 2022-04-02 1.0 1.0 1.0 1.0 -1.0 1.0 1.0 4.0 2.0 1.0 1.0 47.22
802 2022-04-03 1.0 1.0 1.0 1.0 -1.0 1.0 1.0 4.0 2.0 1.0 1.0 47.22

Visualize indicator values

We can visualize indicator values with .map() method. Arguments are the same as JHUData.map(), but country name cannot be specified.

[29]:
oxcgrt_data.map(variable="Stringency_index")
[Figure: global map of Stringency_index by country (_images/usage_dataset_69_0.png)]

Values are here.

[30]:
oxcgrt_data.layer().tail()
[30]:
Date ISO3 Country School_closing Workplace_closing Cancel_events Gatherings_restrictions Transport_closing Stay_home_restrictions Internal_movement_restrictions International_movement_restrictions Information_campaigns Testing_policy Contact_tracing Stringency_index
186924 2022-03-29 GRL Greenland 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 1.0 2.0 3.0 2.0 13.89
186925 2022-03-30 GRL Greenland 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 1.0 2.0 3.0 2.0 13.89
186926 2022-03-31 GRL Greenland 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 1.0 2.0 3.0 2.0 13.89
186927 2022-04-01 GRL Greenland 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 1.0 2.0 3.0 2.0 13.89
186928 2022-04-02 GRL Greenland 1.0 2.0 1.0 2.0 1.0 1.0 -1.0 1.0 2.0 3.0 2.0 13.89

The number of tests

The number of tests is also key information to understand the situation. PCRData class will be created with DataLoader.pcr() method.

[31]:
pcr_data = loader.pcr()
[32]:
type(pcr_data)
[32]:
covsirphy.cleaning.pcr_data.PCRData

Because the records are retrieved via “COVID-19 Data Hub”, as with JHUData, the data description and raw data are the same.

[33]:
# Description
print(pcr_data.citation)
# Raw
# pcr_data.raw.tail()
Hirokazu Takaya (2020-2022), COVID-19 dataset in Japan, GitHub repository, https://github.com/lisphilar/covid19-sir/data/japan
(Secondary source) Guidotti, E., Ardia, D., (2020), "COVID-19 Data Hub", Journal of Open Source Software 5(51):2376, doi: 10.21105/joss.02376.
Hasell, J., Mathieu, E., Beltekian, D. et al. A cross-country database of COVID-19 testing. Sci Data 7, 345 (2020). https://doi.org/10.1038/s41597-020-00688-8

The cleaned dataset is here.

[34]:
pcr_data.cleaned().tail()
[34]:
Date Country Province Tests Confirmed
1215023 2022-03-29 Zimbabwe - 2176708 246042
1215024 2022-03-30 Zimbabwe - 2176708 246182
1215025 2022-03-31 Zimbabwe - 2176708 246286
1215026 2022-04-01 Zimbabwe - 2185767 246414
1215027 2022-04-02 Zimbabwe - 2185767 246481

Subset for area

PCRData.subset() creates a subset for a specific area. We can select country name and province name.

Subset for a country:
We can use either country names or ISO3 codes.
[35]:
pcr_data.subset("Japan").tail()
# Or, with ISO3 code
# pcr_data.subset("JPN").tail()
[35]:
Date Tests Tests_diff Confirmed
783 2022-03-30 43110998 126369 6452108
784 2022-03-31 43265925 154927 6504873
785 2022-04-01 43413502 147577 6552920
786 2022-04-02 43606340 192838 6606464
787 2022-04-03 43717117 110777 6653841

Positive rate

Under the assumption that all tests were PCR tests, we can calculate the positive rate of PCR tests as “the number of confirmed cases per the number of tests” with the PCRData.positive_rate() method.

[36]:
pcr_data.positive_rate("Japan").tail()
[Figure: plot produced by pcr_data.positive_rate("Japan") (_images/usage_dataset_83_0.png)]
[36]:
Date ISO3 Country Province Tests Confirmed Tests_diff Confirmed_diff Test_positive_rate
782 2022-03-30 JPN Japan - 43110998 6452108 128746.000000 42699.571429 33.165746
783 2022-03-31 JPN Japan - 43265925 6504873 127702.285714 45008.142857 35.244587
784 2022-04-01 JPN Japan - 43413502 6552920 127498.142857 44863.000000 35.187179
785 2022-04-02 JPN Japan - 43606340 6606464 131789.428571 45622.428571 34.617669
786 2022-04-03 JPN Japan - 43717117 6653841 132673.142857 45669.571429 34.422620
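
The Test_positive_rate column above equals Confirmed_diff divided by Tests_diff, expressed as a percentage (for example, 42699.571429 / 128746 * 100 is about 33.17). The fractional values of the *_diff columns look like 7-day averages of daily differences, but this is an inference from the output. The sketch below implements that assumed calculation with pandas; `positive_rate_sketch` and its `window` argument are hypothetical names, not part of covsirphy.

```python
import pandas as pd


def positive_rate_sketch(df, window=7):
    """Hypothetical sketch: positive rate [%] from cumulative Tests and Confirmed."""
    out = df.copy()
    # Daily increments of the cumulative values, smoothed with a moving average
    out["Tests_diff"] = out["Tests"].diff().rolling(window).mean()
    out["Confirmed_diff"] = out["Confirmed"].diff().rolling(window).mean()
    # Confirmed cases per test, as a percentage
    out["Test_positive_rate"] = out["Confirmed_diff"] / out["Tests_diff"] * 100
    return out


# Toy cumulative data: 1000 tests and 100 confirmed cases per day
toy = pd.DataFrame(
    {"Tests": [i * 1000 for i in range(10)], "Confirmed": [i * 100 for i in range(10)]}
)
result = positive_rate_sketch(toy)
print(result.tail())
```

The first rows are NaN because a full window of daily differences is not yet available.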

Visualize the number of tests

We can visualize the number of tests with the .map() method. When country is None, a global map will be shown. Arguments are the same as JHUData.map(), but the variable name cannot be specified.

Country level data:

[37]:
pcr_data.map(country=None)
[Figure: global map of the number of tests by country (_images/usage_dataset_86_0.png)]

Values are here.

[38]:
pcr_data.layer(country=None).tail()
[38]:
Date ISO3 Country Tests Confirmed
98081 2022-03-29 ZWE Zimbabwe 2176708 246042
98082 2022-03-30 ZWE Zimbabwe 2176708 246182
98083 2022-03-31 ZWE Zimbabwe 2176708 246286
98084 2022-04-01 ZWE Zimbabwe 2185767 246414
98085 2022-04-02 ZWE Zimbabwe 2185767 246481

Province level data:

[39]:
pcr_data.map(country="Japan")
[Figure: map of the number of tests in Japan by province (_images/usage_dataset_90_0.png)]

Values are here.

[40]:
pcr_data.layer(country="Japan").tail()
[40]:
Date ISO3 Country Province Tests Confirmed
35648 2022-03-30 JPN Japan Yamanashi 239637 22826
35649 2022-03-31 JPN Japan Yamanashi 239637 23127
35650 2022-04-01 JPN Japan Yamanashi 248105 23382
35651 2022-04-02 JPN Japan Yamanashi 248105 23382
35652 2022-04-03 JPN Japan Yamanashi 248105 24119

Vaccinations

Vaccination is a key factor in ending the outbreak as soon as possible. A VaccineData instance is created with the DataLoader.vaccine() method.

[41]:
vaccine_data = loader.vaccine()
[42]:
type(vaccine_data)
[42]:
covsirphy.cleaning.vaccine_data.VaccineData

Description is here.

[43]:
print(vaccine_data.citation)
Hasell, J., Mathieu, E., Beltekian, D. et al. A cross-country database of COVID-19 testing. Sci Data 7, 345 (2020). https://doi.org/10.1038/s41597-020-00688-8
Hirokazu Takaya (2020-2022), COVID-19 dataset in Japan, GitHub repository, https://github.com/lisphilar/covid19-sir/data/japan

Raw data is here.

[44]:
vaccine_data.raw.tail()
[44]:
Date ISO3 Country Province Product Vaccinations Vaccinations_boosters Vaccinated_once Vaccinated_full
1215021 2022-03-27 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 8845039.0 433129.0 4918147.0 3493763.0
1215022 2022-03-28 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 8934360.0 457434.0 4975433.0 3501493.0
1215023 2022-03-29 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9039729.0 476359.0 5053114.0 3510256.0
1215024 2022-03-30 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9202369.0 505132.0 5175175.0 3522062.0
1215025 2022-03-31 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0

The next is the cleaned dataset.

[45]:
vaccine_data.cleaned().tail()
[45]:
Date ISO3 Country Province Product Vaccinations Vaccinations_boosters Vaccinated_once Vaccinated_full
173355 2022-03-30 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9202369.0 505132.0 5175175.0 3522062.0
173356 2022-03-31 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0
173357 2022-04-01 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0
173358 2022-04-02 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0
173359 2022-04-03 ZWE Zimbabwe - Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0
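
In the cleaned table above, the values reported for 2022-03-31 are repeated through 2022-04-03, which looks like forward filling of cumulative counts on a daily index. Whether covsirphy performs exactly this step is an assumption; the sketch below only shows the general technique with pandas on two values copied from the raw data.

```python
import pandas as pd

# Last two reported cumulative values copied from the raw data above
raw = pd.DataFrame(
    {"Vaccinations": [9202369.0, 9368822.0]},
    index=pd.to_datetime(["2022-03-30", "2022-03-31"]),
)
# Extend to a daily index and carry the last reported value forward
full_index = pd.date_range("2022-03-30", "2022-04-03", freq="D")
filled = raw.reindex(full_index).ffill()
print(filled)
```

Forward filling suits cumulative variables because a missing report does not mean the cumulative count dropped.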

Note for variables

The variables are defined as follows.

  • Vaccinations: cumulative number of vaccinations

  • Vaccinations_boosters: cumulative number of booster vaccinations

  • Vaccinated_once: cumulative number of people who received at least one vaccine dose

  • Vaccinated_full: cumulative number of people who received all doses prescribed by the protocol

Registered countries can be checked with VaccineData.countries() method.

[46]:
pprint(vaccine_data.countries(), compact=True)
['Afghanistan', 'Albania', 'Algeria', 'Andorra', 'Angola', 'Anguilla',
 'Antigua and Barbuda', 'Argentina', 'Armenia', 'Aruba', 'Australia', 'Austria',
 'Azerbaijan', 'Bahamas', 'Bahrain', 'Bangladesh', 'Barbados', 'Belarus',
 'Belgium', 'Belize', 'Benin', 'Bermuda', 'Bhutan', 'Bolivia',
 'Bonaire Sint Eustatius and Saba', 'Bosnia and Herzegovina', 'Botswana',
 'Brazil', 'British Virgin Islands', 'Brunei', 'Bulgaria', 'Burkina Faso',
 'Burundi', 'Cambodia', 'Cameroon', 'Canada', 'Cape Verde', 'Cayman Islands',
 'Central African Republic', 'Chad', 'Chile', 'China', 'Colombia', 'Comoros',
 'Cook Islands', 'Costa Rica', "Cote d'Ivoire", 'Croatia', 'Cuba', 'Curacao',
 'Cyprus', 'Czechia', 'Democratic Republic of Congo', 'Denmark', 'Djibouti',
 'Dominica', 'Dominican Republic', 'Ecuador', 'Egypt', 'El Salvador',
 'Equatorial Guinea', 'Estonia', 'Eswatini', 'Ethiopia', 'Faeroe Islands',
 'Falkland Islands', 'Fiji', 'Finland', 'France', 'French Polynesia', 'Gabon',
 'Gambia', 'Georgia', 'Germany', 'Ghana', 'Gibraltar', 'Greece', 'Greenland',
 'Grenada', 'Guatemala', 'Guernsey', 'Guinea', 'Guinea-Bissau', 'Guyana',
 'Haiti', 'Honduras', 'Hong Kong', 'Hungary', 'Iceland', 'India', 'Indonesia',
 'Iran', 'Iraq', 'Ireland', 'Isle of Man', 'Israel', 'Italy', 'Jamaica',
 'Japan', 'Jersey', 'Jordan', 'Kazakhstan', 'Kenya', 'Kiribati', 'Kuwait',
 'Kyrgyzstan', 'Laos', 'Latvia', 'Lebanon', 'Lesotho', 'Liberia', 'Libya',
 'Liechtenstein', 'Lithuania', 'Luxembourg', 'Macao', 'Madagascar', 'Malawi',
 'Malaysia', 'Maldives', 'Mali', 'Malta', 'Mauritania', 'Mauritius', 'Mexico',
 'Moldova', 'Monaco', 'Mongolia', 'Montenegro', 'Montserrat', 'Morocco',
 'Mozambique', 'Myanmar', 'Namibia', 'Nauru', 'Nepal', 'Netherlands',
 'New Caledonia', 'New Zealand', 'Nicaragua', 'Niger', 'Nigeria', 'Niue',
 'North Macedonia', 'Norway', 'Oman', 'Pakistan', 'Palestine', 'Panama',
 'Papua New Guinea', 'Paraguay', 'Peru', 'Philippines', 'Pitcairn', 'Poland',
 'Portugal', 'Qatar', 'Republic of the Congo', 'Romania', 'Russia', 'Rwanda',
 'Saint Helena', 'Saint Kitts and Nevis', 'Saint Lucia',
 'Saint Vincent and the Grenadines', 'Samoa', 'San Marino',
 'Sao Tome and Principe', 'Saudi Arabia', 'Senegal', 'Serbia', 'Seychelles',
 'Sierra Leone', 'Singapore', 'Sint Maarten (Dutch part)', 'Slovakia',
 'Slovenia', 'Solomon Islands', 'Somalia', 'South Africa', 'South Korea',
 'South Sudan', 'Spain', 'Sri Lanka', 'Sudan', 'Suriname', 'Sweden',
 'Switzerland', 'Syria', 'Taiwan', 'Tajikistan', 'Tanzania', 'Thailand',
 'Timor', 'Togo', 'Tokelau', 'Tonga', 'Trinidad and Tobago', 'Tunisia',
 'Turkey', 'Turkmenistan', 'Turks and Caicos Islands', 'Tuvalu', 'Uganda',
 'Ukraine', 'United Arab Emirates', 'United Kingdom', 'United States',
 'Uruguay', 'Uzbekistan', 'Vanuatu', 'Venezuela', 'Vietnam',
 'Wallis and Futuna', 'Yemen', 'Zambia', 'Zimbabwe']

Subset for area

VaccineData.subset() creates a subset for a specific area. We can select only country name. Note that province level data is not registered.

Subset for a country:
We can use either country names or ISO3 codes.
[47]:
vaccine_data.subset("Japan").tail()
# Or, with ISO3 code
# vaccine_data.subset("JPN").tail()
[47]:
Date Vaccinations Vaccinations_boosters Vaccinated_once Vaccinated_full
1571 2022-04-01 298700671.0 52521088.0 137629998.0 108549585.0
1572 2022-04-02 298700671.0 52521088.0 137629998.0 108549585.0
1573 2022-04-02 298700671.0 52521088.0 137629998.0 108549585.0
1574 2022-04-03 298700671.0 52521088.0 137629998.0 108549585.0
1575 2022-04-03 298700671.0 52521088.0 137629998.0 108549585.0

Visualize the number of vaccinations

We can visualize the number of vaccinations and the other variables with the .map() method. Arguments are the same as JHUData.map(), but the country name cannot be specified.

[48]:
vaccine_data.map()
[Figure: global map produced by vaccine_data.map() (_images/usage_dataset_109_0.png)]

Values are here.

[49]:
vaccine_data.layer().tail()
[49]:
Date ISO3 Country Product Vaccinations Vaccinations_boosters Vaccinated_once Vaccinated_full
171779 2022-03-30 ZWE Zimbabwe Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9202369.0 505132.0 5175175.0 3522062.0
171780 2022-03-31 ZWE Zimbabwe Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0
171781 2022-04-01 ZWE Zimbabwe Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0
171782 2022-04-02 ZWE Zimbabwe Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0
171783 2022-04-03 ZWE Zimbabwe Oxford/AstraZeneca, Sinopharm/Beijing, Sinovac... 9368822.0 539033.0 5297081.0 3532708.0

Mobility

The level of mobility is a key factor in \(\rho\) (effective contact rate) of SIR-derived ODE models. MobilityData class will be created with DataLoader.mobility() method.
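
To illustrate why the contact rate matters, here is a minimal sketch of a generic SIR model integrated with one Euler step per day. It is not the covsirphy implementation: the function name `simulate_sir`, the per-day rates `rho` and `sigma`, and all parameter values are assumptions chosen only for illustration.

```python
def simulate_sir(rho, sigma, days=200, population=1_000_000, infected=10):
    """Return the peak number of infected people in a basic SIR model.

    rho: contact rate per day (assumed value), sigma: recovery rate per day.
    Uses a simple one-day Euler step; illustrative only.
    """
    s, i, r = population - infected, float(infected), 0.0
    peak = i
    for _ in range(days):
        new_infections = rho * s * i / population
        new_recoveries = sigma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return peak


# Halving the contact rate (e.g. through reduced mobility) lowers the peak
peak_high = simulate_sir(rho=0.3, sigma=0.1)
peak_low = simulate_sir(rho=0.15, sigma=0.1)
print(peak_low < peak_high)
```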

[50]:
mobility_data = loader.mobility()
[51]:
type(mobility_data)
[51]:
covsirphy.cleaning.mobility_data.MobilityData

Description is here.

[52]:
print(mobility_data.citation)
O. Wahltinez and others (2020), COVID-19 Open-Data: curating a fine-grained, global-scale data repository for SARS-CoV-2,  Work in progress, https://goo.gle/covid-19-open-data

Raw data is here.

[53]:
mobility_data.raw.tail()
[53]:
Date ISO3 Country Province Mobility_grocery_and_pharmacy Mobility_parks Mobility_transit_stations Mobility_retail_and_recreation Mobility_residential Mobility_workplaces
1215020 2022-03-26 ZWE Zimbabwe - 209.0 277.0 194.0 199.0 109.0 186.0
1215021 2022-03-27 ZWE Zimbabwe - 219.0 314.0 202.0 213.0 112.0 208.0
1215022 2022-03-28 ZWE Zimbabwe - 197.0 283.0 201.0 192.0 112.0 177.0
1215023 2022-03-29 ZWE Zimbabwe - 206.0 282.0 195.0 195.0 112.0 177.0
1215024 2022-03-30 ZWE Zimbabwe - 210.0 284.0 206.0 198.0 112.0 177.0

The next is the cleaned dataset.

[54]:
mobility_data.cleaned().tail()
[54]:
Date ISO3 Country Province Mobility_grocery_and_pharmacy Mobility_parks Mobility_transit_stations Mobility_retail_and_recreation Mobility_residential Mobility_workplaces
1215020 2022-03-26 ZWE Zimbabwe - 209 277 194 199 109 186
1215021 2022-03-27 ZWE Zimbabwe - 219 314 202 213 112 208
1215022 2022-03-28 ZWE Zimbabwe - 197 283 201 192 112 177
1215023 2022-03-29 ZWE Zimbabwe - 206 282 195 195 112 177
1215024 2022-03-30 ZWE Zimbabwe - 210 284 206 198 112 177

Note for variables

Definitions of the variables are as follows.

  • Mobility_grocery_and_pharmacy (int): % to baseline in visits (grocery markets, pharmacies etc.)

  • Mobility_parks (int): % to baseline in visits (parks etc.)

  • Mobility_transit_stations (int): % to baseline in visits (public transport hubs etc.)

  • Mobility_retail_and_recreation (int): % to baseline in visits (restaurants, museums etc.)

  • Mobility_residential (int): % to baseline in visits (places of residence)

  • Mobility_workplaces (int): % to baseline in visits (places of work)
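
As a rough illustration of how to read these values, the sketch below converts one cleaned record into the percentage change from baseline. It assumes that a value of 100 means "same as baseline"; the constant name `BASELINE` is introduced here only for illustration. The record is the Zimbabwe row for 2022-03-30 from the cleaned dataset above.

```python
# Cleaned values for Zimbabwe on 2022-03-30, copied from the output above
record = {
    "Mobility_grocery_and_pharmacy": 210,
    "Mobility_parks": 284,
    "Mobility_transit_stations": 206,
    "Mobility_retail_and_recreation": 198,
    "Mobility_residential": 112,
    "Mobility_workplaces": 177,
}

BASELINE = 100  # assumption: 100 corresponds to the baseline level

# Percentage-point difference from the baseline for each variable
change = {name: value - BASELINE for name, value in record.items()}
for name, diff in change.items():
    print(f"{name}: {diff:+d}")
```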

Registered countries can be checked with MobilityData.countries() method.

[55]:
pprint(mobility_data.countries(), compact=True)
['Afghanistan', 'Angola', 'Antigua and Barbuda', 'Argentina', 'Aruba',
 'Australia', 'Austria', 'Bahamas', 'Bahrain', 'Bangladesh', 'Barbados',
 'Belarus', 'Belgium', 'Belize', 'Benin', 'Bolivia', 'Bosnia and Herzegovina',
 'Botswana', 'Brazil', 'Bulgaria', 'Burkina Faso', 'Cambodia', 'Cameroon',
 'Canada', 'Cape Verde', 'Chile', 'Colombia', 'Costa Rica', "Cote d'Ivoire",
 'Croatia', 'Czech Republic', 'Denmark', 'Dominican Republic', 'Ecuador',
 'Egypt', 'El Salvador', 'Estonia', 'Fiji', 'Finland', 'France', 'Gabon',
 'Germany', 'Ghana', 'Greece', 'Guatemala', 'Haiti', 'Honduras', 'Hungary',
 'India', 'Indonesia', 'Iraq', 'Israel', 'Italy', 'Jamaica', 'Japan', 'Jordan',
 'Kazakhstan', 'Kenya', 'Kuwait', 'Kyrgyzstan', 'Laos', 'Latvia', 'Lebanon',
 'Libya', 'Lithuania', 'Luxembourg', 'Malaysia', 'Mali', 'Malta', 'Mauritius',
 'Mexico', 'Moldova', 'Mongolia', 'Morocco', 'Mozambique', 'Myanmar', 'Namibia',
 'Nepal', 'Netherlands', 'New Zealand', 'Nicaragua', 'Niger', 'Nigeria',
 'Norway', 'Oman', 'Pakistan', 'Panama', 'Paraguay', 'Peru', 'Philippines',
 'Poland', 'Portugal', 'Puerto Rico', 'Qatar', 'Romania', 'Russia', 'Rwanda',
 'Saudi Arabia', 'Senegal', 'Serbia', 'Singapore', 'Slovakia', 'Slovenia',
 'South Africa', 'South Korea', 'Spain', 'Sri Lanka', 'Sweden', 'Switzerland',
 'Taiwan', 'Tajikistan', 'Tanzania', 'Thailand', 'Togo', 'Trinidad and Tobago',
 'Turkey', 'Uganda', 'Ukraine', 'United Arab Emirates', 'United Kingdom',
 'United States of America', 'Uruguay', 'Venezuela', 'Vietnam', 'Yemen',
 'Zambia', 'Zimbabwe']

Subset for area

MobilityData.subset() creates a subset for a specific area (country/province).

Subset for a country: We can use either country names or ISO3 codes.

[56]:
mobility_data.subset("Japan").tail()
# Or, with ISO3 code
# mobility_data.subset("JPN").tail()
[56]:
Date Mobility_grocery_and_pharmacy Mobility_parks Mobility_transit_stations Mobility_retail_and_recreation Mobility_residential Mobility_workplaces
770 2022-03-26 96 62 78 86 106 91
771 2022-03-27 105 108 86 95 102 96
772 2022-03-28 101 99 83 95 105 90
773 2022-03-29 102 100 81 97 105 89
774 2022-03-30 104 115 82 98 104 89
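
Daily values fluctuate (for example, parks on 2022-03-26), so it can help to average over several days before comparing variables. The sketch below averages the five rows shown above using plain Python; it does not call covsirphy, and the variable names are chosen only for illustration.

```python
# Column names and the five Japan rows copied from the output above
columns = [
    "Mobility_grocery_and_pharmacy", "Mobility_parks",
    "Mobility_transit_stations", "Mobility_retail_and_recreation",
    "Mobility_residential", "Mobility_workplaces",
]
rows = [
    (96, 62, 78, 86, 106, 91),    # 2022-03-26
    (105, 108, 86, 95, 102, 96),  # 2022-03-27
    (101, 99, 83, 95, 105, 90),   # 2022-03-28
    (102, 100, 81, 97, 105, 89),  # 2022-03-29
    (104, 115, 82, 98, 104, 89),  # 2022-03-30
]

# Average each column over the five days
averages = {
    name: sum(row[idx] for row in rows) / len(rows)
    for idx, name in enumerate(columns)
}
for name, value in averages.items():
    print(f"{name}: {value:.1f}")
```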

Visualize mobility data

We can visualize the levels of mobility with MobilityData.map() method. Arguments are the same as JHUData.

[57]:
mobility_data.map(country=None)
_images/usage_dataset_127_0.png

Values are here.

[58]:
mobility_data.layer().tail()
[58]:
Date ISO3 Country Mobility_grocery_and_pharmacy Mobility_parks Mobility_transit_stations Mobility_retail_and_recreation Mobility_residential Mobility_workplaces
96441 2022-03-26 ZWE Zimbabwe 209 277 194 199 109 186
96442 2022-03-27 ZWE Zimbabwe 219 314 202 213 112 208
96443 2022-03-28 ZWE Zimbabwe 197 283 201 192 112 177
96444 2022-03-29 ZWE Zimbabwe 206 282 195 195 112 177
96445 2022-03-30 ZWE Zimbabwe 210 284 206 198 112 177

Population pyramid

With a population pyramid, we can divide the population into sub-groups. This is useful when we analyse the meaning of parameters. For example, the number of days people go out differs between sub-groups. PyramidData class will be created with DataLoader.pyramid() method.

[59]:
pyramid_data = loader.pyramid()
[60]:
type(pyramid_data)
[60]:
covsirphy.cleaning.pyramid.PopulationPyramidData

Description is here.

[61]:
print(pyramid_data.citation)
World Bank Group (2020), World Bank Open Data, https://data.worldbank.org/

The raw dataset is not registered. A subset is retrieved when PyramidData.subset() is called.

[62]:
pyramid_data.subset("Japan").tail()
Retrieving population pyramid dataset (Japan) from https://data.worldbank.org/
[62]:
Age Population Per_total
113 118 262648 0.00225
114 119 262648 0.00225
115 120 262648 0.00225
116 121 262648 0.00225
117 122 262648 0.00225

“Per_total” is the proportion of the age group in the total population.
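
The sketch below shows how such a proportion can be computed and then summed over sub-groups. The pyramid values and the age groups are hypothetical toy numbers, not real data, and the code does not use the covsirphy API.

```python
# Hypothetical toy pyramid (NOT real data): age -> population
pyramid = {0: 100, 1: 300, 2: 600}

total = sum(pyramid.values())
# Proportion of each age in the total population
per_total = {age: population / total for age, population in pyramid.items()}

# Hypothetical sub-groups defined by age ranges (inclusive)
groups = {"young": (0, 1), "old": (2, 2)}
group_share = {
    label: sum(p for age, p in per_total.items() if low <= age <= high)
    for label, (low, high) in groups.items()
}
print(per_total)
print(group_share)
```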

Japan-specific dataset

This includes the number of confirmed/infected/fatal/recovered/tests/moderate/severe cases at country/prefecture level and metadata of each prefecture (province). JapanData class will be created with DataLoader.japan() method.

[63]:
japan_data = loader.japan()
[64]:
type(japan_data)
[64]:
covsirphy.cleaning.japan_data.JapanData

Description is here.

[65]:
print(japan_data.citation)
Hirokazu Takaya (2020-2022), COVID-19 dataset in Japan, GitHub repository, https://github.com/lisphilar/covid19-sir/data/japan

The next is the cleaned dataset.

[66]:
japan_data.cleaned().tail()
[66]:
Date Country Province Confirmed Infected Fatal Recovered Tests Moderate Severe Vaccinations Vaccinations_boosters Vaccinated_once Vaccinated_full
36633 2022-04-02 Japan Nagasaki 33504 1499 116 31889 308815 3000 1 0 0 0 0
36634 2022-04-02 Japan Okinawa 124639 7667 441 116531 664154 7664 8 0 0 0 0
36635 2022-04-02 Japan Yamanashi 23382 1793 64 21525 248105 1798 1 0 0 0 0
36636 2022-04-03 Japan - 6653841 460801 28248 6164792 43717117 444917 510 58919234270 1800280086 32796114518 24322839666
36637 2022-04-03 Japan Entering 14266 1062 8 13196 1702306 1062 0 233882625393 41124011904 107764288434 84994325055
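
In the rows above, Infected equals Confirmed minus Fatal minus Recovered. The following sketch checks this relation using values copied from the output; it uses plain Python, and the variable names are chosen only for illustration.

```python
# (Province, Confirmed, Infected, Fatal, Recovered), copied from the output above
rows = [
    ("Nagasaki", 33504, 1499, 116, 31889),
    ("Okinawa", 124639, 7667, 441, 116531),
    ("Yamanashi", 23382, 1793, 64, 21525),
    ("-", 6653841, 460801, 28248, 6164792),
    ("Entering", 14266, 1062, 8, 13196),
]

# Recompute Infected from the other three columns and compare
matches = {
    province: infected == confirmed - fatal - recovered
    for province, confirmed, infected, fatal, recovered in rows
}
print(matches)
```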

Visualize values

We can visualize the values with .map() method. Arguments are the same as JHUData.

[67]:
japan_data.map(variable="Severe")
_images/usage_dataset_146_0.png

Values are here.

[68]:
japan_data.layer(country="Japan").tail()
[68]:
Date Country Province Confirmed Infected Fatal Recovered Tests Moderate Severe Vaccinations Vaccinations_boosters Vaccinated_once Vaccinated_full
35845 2022-04-02 Japan Osaka 802145 37009 4714 760422 6224907 36696 291 0 0 0 0
35846 2022-04-02 Japan Nagasaki 33504 1499 116 31889 308815 3000 1 0 0 0 0
35847 2022-04-02 Japan Okinawa 124639 7667 441 116531 664154 7664 8 0 0 0 0
35848 2022-04-02 Japan Yamanashi 23382 1793 64 21525 248105 1798 1 0 0 0 0
35849 2022-04-03 Japan Entering 14266 1062 8 13196 1702306 1062 0 233882625393 41124011904 107764288434 84994325055

A map of country-level data is not available, but country-level data can be retrieved.

[69]:
japan_data.layer(country=None).tail()
[69]:
Date Country Confirmed Infected Fatal Recovered Tests Moderate Severe Vaccinations Vaccinations_boosters Vaccinated_once Vaccinated_full
783 2022-03-30 Japan 6452108 415942 27913 6008253 43110998 401231 655 57725021813 1590785961 32245594526 23888641326
784 2022-03-31 Japan 6504873 428780 28010 6048083 43265925 411013 627 58023132257 1642716822 32383224524 23997190911
785 2022-04-01 Japan 6552920 441767 28097 6083056 43413502 423151 533 58321832928 1695237910 32520854522 24105740496
786 2022-04-02 Japan 6606464 455864 28200 6122400 43606340 440031 518 58620533599 1747758998 32658484520 24214290081
787 2022-04-03 Japan 6653841 460801 28248 6164792 43717117 444917 510 58919234270 1800280086 32796114518 24322839666
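
The Confirmed column is cumulative, so the number of newly confirmed cases per day is the difference between consecutive rows. A minimal sketch with the values copied from the output above (plain Python, not a covsirphy method):

```python
# (Date, cumulative Confirmed) for Japan, copied from the output above
cumulative = [
    ("2022-03-30", 6452108),
    ("2022-03-31", 6504873),
    ("2022-04-01", 6552920),
    ("2022-04-02", 6606464),
    ("2022-04-03", 6653841),
]

# Difference between each day and the previous day
daily_new = [
    (date, current - previous)
    for (_, previous), (date, current) in zip(cumulative, cumulative[1:])
]
for date, new_cases in daily_new:
    print(date, new_cases)
```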

Metadata

Additionally, JapanData.meta() retrieves metadata for Japanese prefectures.

[70]:
japan_data.meta().tail()
Retrieving Metadata of Japan dataset from https://github.com/lisphilar/covid19-sir/data/japan
[70]:
Prefecture Admin_Capital Admin_Region Admin_Num Area_Habitable Area_Total Clinic_bed_Care Clinic_bed_Total Hospital_bed_Care Hospital_bed_Specific Hospital_bed_Total Hospital_bed_Tuberculosis Hospital_bed_Type-I Hospital_bed_Type-II Population_Female Population_Male Population_Total Location_Latitude Location_Longitude
42 Kumamoto Kumamoto Kyushu 43 2796 7409 497 4628 8340 0 33710 95 2 46 933 833 1765 32.790513 130.742388
43 Oita Oita Kyushu 44 1799 6341 269 3561 2618 0 19834 50 2 38 607 546 1152 33.238391 131.612658
44 Miyazaki Miyazaki Kyushu 45 1850 7735 206 2357 3682 0 18769 33 1 30 577 512 1089 31.911188 131.423873
45 Kagoshima Kagoshima Kyushu 46 3313 9187 652 4827 7750 0 32651 98 1 44 863 763 1626 31.560052 130.557745
46 Okinawa Naha Okinawa 47 1169 2281 83 914 3804 0 18710 47 4 20 734 709 1443 26.211761 127.681119
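
As one possible use of the metadata, the sketch below computes the habitable share of each prefecture's total area. The values are copied from the output above. Both area columns are assumed to share the same unit, so the ratio is unitless; the variable names are chosen only for illustration.

```python
# Prefecture -> (Area_Habitable, Area_Total), copied from the output above
areas = {
    "Kumamoto": (2796, 7409),
    "Oita": (1799, 6341),
    "Miyazaki": (1850, 7735),
    "Kagoshima": (3313, 9187),
    "Okinawa": (1169, 2281),
}

# Share of the total area that is habitable
habitable_ratio = {
    prefecture: habitable / total
    for prefecture, (habitable, total) in areas.items()
}
for prefecture, ratio in habitable_ratio.items():
    print(f"{prefecture}: {ratio:.3f}")
```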