Data offered here is offered as is, with no guarantees. As much as possible, government reports
and data feeds have been used, and effort has gone into making this data collection accurate and timely.
This site’s only intention is to give an accurate representation of all the available Covid data for Thailand in one place.
Links to all data sources are included in Downloads.
Proactive tests are normally done at specific high risk locations or places of known cases, rather than random sampling (but it’s possible random sampling may also be included).
Note: SS Cluster is classified as “Work”, but some other market clusters are classified as “Community”. This is because there isn’t enough data to separate out SS cluster cases
between those from factories and those from the market. This could change later.
Risk is most likely determined as part of the PUI criteria process?
Note: Walkin Cases/3x PUI seems to give an estimate of the positive rate (when cases are high), so it is included for when testing data is delayed. It is not the actual positive rate.
Positive rate is a little like fishing in a lake. If you get few nibbles each time you put your line in, you can guess that there are few fish in the lake. Fewer positives per test means fewer infections are likely in the population.
WHO considers enough testing is happening if the positive rate is under 5%, rather than judging by tests per population, but only if 0.1% of the population is being tested per week (about 70k tests per week, or roughly 10k per day, for Thailand). Note this recommendation works best if everyone who might have COVID-19 is equally likely to get tested, and there are reasons why this might not be the case in Thailand.
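The two conditions described above can be sketched as a small check. This is only an illustration of the arithmetic: the function names, parameter names and default thresholds are made up for this example and are not taken from the project's code, and any numbers passed in are placeholders rather than real figures.

```python
def positive_rate(positives, tests):
    """Share of tests that came back positive, as a fraction (0.05 == 5%)."""
    if tests <= 0:
        raise ValueError("tests must be greater than zero")
    return positives / tests


def meets_who_guideline(positives_per_week, tests_per_week, population,
                        max_rate=0.05, min_tested_share=0.001):
    """Check both conditions: enough of the population tested per week,
    and a positive rate below the threshold.

    max_rate and min_tested_share are the 5% and 0.1% figures from the
    text, expressed as fractions (names are hypothetical).
    """
    enough_tests = tests_per_week >= min_tested_share * population
    low_rate = positive_rate(positives_per_week, tests_per_week) < max_rate
    return enough_tests and low_rate
```

A low positive rate on its own is not enough here: if too few people are tested in a week, the function returns False regardless of the rate.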
It’s likely Thailand excludes some test data, so there could be more tests than this data shows. Excluding
proactive tests from the positive rate is perhaps better for comparison with other countries, as they are less random and more likely to be positive since they test known clusters.
Rapid antigen tests are not included in the test data, or in confirmed case numbers (unless they also had a positive PCR test). This is similar to most countries; however, some, like the UK, count antigen tests in both tests and confirmed cases.
As the number of data sources has increased, so has the code needed to fetch, extract and
display this data. All the code is fairly simple Python, however. It is a fun way to learn scraping
data and/or pandas and matplotlib.
Find a GitHub issue and have a go. Many are marked as suitable for beginners:
making new plots
improving existing plots
adding tests so it’s faster to make future fixes
improving scrapers that miss past data, e.g. vaccination reports
To run the tests (will only get files needed for tests)
bin/pytest
To add a test
Only add test data for dates where the format changed and so the scraper had to be updated. See the commit history for dates where this happened, or use code coverage.
Logs from a full scrape can also be used to identify files/dates that are not scraped correctly.
If you are trying to add past regression tests, you can also use git blame covid_data.py on the scraping function to see the dates that lines were added or changed. In some cases comments indicate important dates where the code had to change.
Add an empty file in tests/scraper_type/dl_name.json
For some tests you can use the date of the file instead, or filename.date.json (the date is ignored but helps readability).
Run the tests. This will download just the document needed for that test, scrape it and compare the results against the json.
Of course this will fail, but you can look at the generated data and compare it to the original file or other sources to make sure it looks right.
If the results are correct, there is commented-out code in the test function to export the data to the test json file.
If you are using VS Code to run pytest, you need to refresh the tests list at this point for some reason.
Note that not all scrapers have a test framework set up yet. Follow the existing code to add one, or ask for help.
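The workflow above (empty json file, failing first run, export once the data looks right) can be sketched roughly as follows. This is not the project's test code: scrape_report, check_against_expected and the export flag are hypothetical stand-ins, and the "report" is a toy text format invented for the example.

```python
import json
import tempfile
from pathlib import Path


def scrape_report(text):
    """Hypothetical scraper: pull 'key: number' pairs out of a report."""
    results = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            results[key.strip()] = int(value.strip())
    return results


def check_against_expected(scraped, expected_path, export=False):
    """Compare scraped results with the stored json file.

    With export=True the scraped data is written to the file first,
    which mirrors un-commenting the export code once results look right.
    """
    expected_path = Path(expected_path)
    if export:
        expected_path.write_text(json.dumps(scraped, indent=2, sort_keys=True))
    raw = expected_path.read_text().strip()
    expected = json.loads(raw) if raw else None  # empty file: nothing to match
    return scraped == expected


if __name__ == "__main__":
    test_file = Path(tempfile.mkdtemp()) / "report.2021-05-01.json"
    test_file.write_text("")  # step 1: add an empty file
    scraped = scrape_report("Cases: 10\nTests: 200")
    print(check_against_expected(scraped, test_file))               # fails at first
    print(check_against_expected(scraped, test_file, export=True))  # export, then passes
```

Keeping the export behind an explicit switch means the expected data is only overwritten deliberately, after it has been checked by eye against the original document.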
Running just plots (or latest files)
To get the latest files, change into the root directory of your clone of the repository and then:
When debugging, to scrape just one part first, rearrange the lines in covid_data.py/scrape_and_combine so that the scraping function you want to debug gets called before the others do.
Running full code (warning: will take a long time)
You can just use the test framework without a full download if you want to work on scraping.
To download only the files that interest you first, you can comment out or rearrange the lines in covid_data.scrape_and_combine.
To work on plots, you can download the csv files from the website into the api directory and set env MAX_DAYS=0.
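As a rough illustration of the plots-only setup, the sketch below reads MAX_DAYS from the environment and loads whatever csv files are already in a local api directory. It is an assumption-heavy example: how the project itself interprets MAX_DAYS is not spelled out here, and max_days and load_local_csvs are hypothetical helper names, not functions from the repository.

```python
import csv
import os
import tempfile
from pathlib import Path


def max_days(default=1):
    """Read MAX_DAYS from the environment as an integer.

    The default value and the meaning of the number are assumptions.
    """
    return int(os.environ.get("MAX_DAYS", default))


def load_local_csvs(api_dir="api"):
    """Return {file stem: list of row dicts} for every csv in api_dir."""
    tables = {}
    for path in sorted(Path(api_dir).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            tables[path.stem] = list(csv.DictReader(f))
    return tables


if __name__ == "__main__":
    os.environ["MAX_DAYS"] = "0"  # same effect as setting it in the shell
    print(max_days(), sorted(load_local_csvs()))
```

The csv module is used only to keep the sketch dependency-free; for actual plotting work pandas would be the more natural way to read the same files.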
To run the full scrape (warning: this will take a long time, as it downloads all the documents into a local cache)