Dylan Jay edited this page Apr 19, 2021 · 55 revisions

Understanding Thailand's Covid Positive Rate

This is the rabbit hole I went down to answer two questions: what is Thailand's positive rate, and how much testing is actually happening? The end result is a daily automated scrape of all the various sources of Covid data, combined and downloadable for your convenience.

My conclusions are

  • There isn't a 100% reliable positive rate that can be used.
  • The testing number reported in the situation reports is actually just the PUI number, not tests.
  • Only PCR testing data is available, and cases have been confirmed in the past without PCR tests. It's unclear if this will continue to be the case. There is, however, an argument that proactive testing shouldn't be included in a positive rate as it's not random.
  • Positive rate should tell you if enough testing is happening, but if the sick aren't equally likely to get tested then it becomes a less useful measure.
  • Thailand's 2nd wave occurred in part because of groups that were less likely to get tested (migrant workers).
  • Not all provinces have the same positive rate, especially over time.

PUI Counts

I'd long suspected PUI counts are a good proxy for testing numbers. PUI stands for Person Under Investigation and represents someone MOPH has determined is at high risk of having Covid. There are formal published criteria, which are a mix of symptoms and who you have had contact with. Then I started seeing test numbers appear next to the daily PUI numbers, so I started by trying to find an existing graph or data source for these daily reported numbers. The MOPH APIs don't report this, just cases, deaths, hospitalisations and recoveries. It turns out that there is a daily situation report in PDF format with tables containing a cumulative PUI number among other data. There are Thai versions and English translations (delayed by a few days). I'm parsing both, getting numbers back to April 2020.

It turns out the "Tests" number in the situation reports is unlikely to mean "tests performed"; it is rather a measure of the number of people tested, including those who did not meet the PUI criteria. This number is not very useful, as it seems to be just the daily PUI number plus a couple of mid-year dumps of additional numbers (possibly everyone privately tested up to that point).

OurWorldInData

After this I discovered OurWorldInData was also graphing Thailand testing data. However at that time it was at least a month out of date.

Our World in Data Testing Graphs

I asked OurWorldInData whether they could use the PUI numbers instead to get more up-to-date testing data. They said that MOPH had previously made an XLSX available on its site with actual testing data, and that this was what they preferred to use.

Note they are using the XLSX data, which it turns out is only the public testing data. Private tests would include quarantine hotels and anyone paying for a test, either for a fit-to-fly certificate or because they don't qualify as a PUI.

A few days later they let me know the XLSX data was available again and updated.

MOPH Testing data

Next I parsed the XLSX with testing data from the MOPH shared folder. It contains a Pos number (presumably positive test results) and a Total number (presumably the total number of tests performed). It also contains numbers for data that didn't have an assigned date, covering the period up until the 3rd of April. In the tests graph above I've included this data by distributing it in proportion to the existing data before the 3rd of April.
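As a rough illustration of that proportional distribution, here is a minimal sketch. The function name `distribute_undated`, the inclusive cutoff and the numbers are all assumptions made up for the example; this is not the actual parsing code or real data.

```python
from datetime import date


def distribute_undated(daily, undated_total, cutoff):
    """Spread undated_total over the days on or before cutoff,
    in proportion to each day's existing dated count."""
    eligible = {d: n for d, n in daily.items() if d <= cutoff}
    base = sum(eligible.values())
    if base == 0:
        raise ValueError("no dated tests before the cutoff to scale against")
    result = dict(daily)  # days after the cutoff are left unchanged
    for d, n in eligible.items():
        result[d] = n + undated_total * n / base
    return result


# Made-up numbers, purely for illustration.
daily_tests = {
    date(2020, 4, 1): 100,
    date(2020, 4, 2): 300,
    date(2020, 4, 4): 500,
}
adjusted = distribute_undated(daily_tests, undated_total=200, cutoff=date(2020, 4, 3))
```

Days with more dated tests receive a larger share of the undated tests, and the overall total is preserved.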

Next I discovered there was additional testing information in a series of PowerPoints (in both PPTX and PDF formats). These break the tests down by health area and by private vs public. They also include information on which hospitals performed the tests (not yet parsed). Due to missing files I ended up having to parse all these different formats.

Understanding the testing data

The PUIs follow the testing numbers reasonably well (except for April and January, which we will discuss later), but there are a lot more tests performed than PUIs. Even during the period between the waves, PUI numbers dropped but tests remained at about 8000 a day. In the graph above I've included the total tests (public + private) and just the private tests for comparison. The situation report "tests" number (blue) is mostly hidden as it's the same as the PUI number each day, except for a couple of "catch-up" periods.

How many confirmed cases and positive tests?

You'd think this number would be the easiest to get and understand, but there seem to be some big differences between positive tests and confirmed cases.

From the situation reports you can also get a breakdown of cases that helps us understand what is going on.

From the test data by area we can see which areas have more cases than public positive test results.

April: Why so many more positive results than confirmed cases?

In the first wave (April) there seemed to be a lot more positive test results than confirmed cases. Even the number of private positive results alone was greater than confirmed cases. A single case could be tested multiple times, so you would expect positive results to be larger than confirmed cases; however, in April they were up to 8 times higher (if including public and private results).

February: Why more cases than positive results?

During the second wave, positive results and cases were close up until early February. At this time there was a government initiative to do large amounts of proactive testing in factories with migrant workers. This resulted in a big jump in confirmed cases, but it didn't seem to result in a similar jump in positive test results. For some reason these tests seem to be excluded from the testing data. Over the same period there isn't a large jump in the number of tests either.

There are a few possibilities:

  • Reports seem to indicate some cases are "historical", which could mean antibody tests were used, which would be unusual.
    • The MOPH testing data seems to be only for PCR tests.
    • There are news reports where the governor of Samut Sakhon refers to using blood tests to save money.
  • Antigen tests were used and these aren't included in the testing data.
  • Pooled PCR tests with a "group test" setup were used.
  • PCR tests were used but for some reason not included.

At the moment my conclusion is that antibody tests were most likely used for large amounts of the proactive testing in Samut Sakhon (SS).

This means a few things

  • The positive rate isn't accurate for February.
  • If antibody tests were used, it would be unusual to confirm a case without an additional PCR test.
  • If antibody tests were used, the confirmed case counts are not correct, since historical cases could have occurred months earlier.
  • If antigen tests were used and not verified using PCR, then this could have been done to save money. The positive rate is still inaccurate.
  • There is an argument that proactive testing should be excluded from a positive rate, as a positive rate is meant to represent a random sample that shows how likely you are to find more cases if you tested more. Proactive testing isn't random. It's generally done when you know there is a cluster and are expecting to find lots of cases in a specific location.
  • Our World in Data excludes proactive testing for this reason.

Is enough testing being done (Positive Rate)?

One way to work out if enough testing is being done is to measure the positive rate, that is, the share of tests performed that come back positive. Since we also have an idea of the share of positive people compared to the people who were tested (at least for free/public testing), we can also compare this rate. This should answer the question "if we test more, will we find more?", because if we are currently testing and only finding 1 in 100 positive then testing more might not find that many more.

WHO recommends a positivity rate of under 3% saying this is a sign the country is doing enough testing.
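The two rates being compared here (positive results/tests and confirmed cases/PUI) are the same simple ratio applied to different inputs. A minimal sketch, using made-up numbers and hypothetical function names, with the threshold defaulting to the 3% figure quoted above:

```python
def positive_rate(positives, tested):
    """Share of tested that were positive; None when nothing was tested."""
    if tested <= 0:
        return None
    return positives / tested


def enough_testing(positives, tested, threshold=0.03):
    """True when the positive rate is below the threshold.
    The 0.03 default is the figure quoted in the text, passed as a parameter."""
    rate = positive_rate(positives, tested)
    return rate is not None and rate < threshold


# Made-up numbers, purely for illustration.
rate_by_tests = positive_rate(80, 8000)   # positive results / tests
rate_by_pui = positive_rate(120, 2000)    # confirmed cases / PUI
```

When the two ratios disagree strongly for the same period, that is a hint that the cases and the test results are not counting the same thing.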

Since we aren't sure about the confirmed cases in April or the positive test data in February, it is hard to know which positivity metric is more correct. However:

  • The April rate is similar whether you use confirmed cases/PUI or positive results/tests (which is what you are supposed to use?). It shows not enough testing was being done. This is not surprising given that test capacity was still being ramped up, as in most other countries.
  • The mid-year positive rate is good. Testing was happening despite there being no cases, even if you take out the private test data (which might include more ASQ tests?).
  • In mid-December we see a worse positive rate as the SS cluster emerged, but a lot worse if looking at confirmed cases/PUI, so possibly antibody tests were being used here too? February saw an even larger difference due to the use of antibody tests. What this means is that you can't really rely on the positive rate from December to show what's going on.
  • Positive rate doesn't tell the whole story. It assumes people are equally likely to be tested, or that the most at risk are likely to be tested. Is testing equally spread out across the country? Migrant workers perhaps had disincentives to get tested (lack of insurance, illegal immigration status, not much money or time to go to the doctor, fear of losing income by being quarantined, etc.). There could be other groups who also have a disincentive.

Note OurWorldInData seems to be using both the public and private testing data to determine their positive rate for Thailand.

Where is the testing happening?

During the second wave some worried that testing was only happening in SS and known clusters. I discovered a source for that data, so I've aggregated it over time.

This data is taken from the weekly summary of testing across the various Thailand health districts. The data is aggregated in date ranges, so for this graph I've averaged the value across those dates. There is also one period of missing data in November. The data seems to match the public testing data totals, so it seems likely private tests are not included. For the labelling of the Thailand health districts, I am unsure about District 13 found in the data as I couldn't find a definition for it. I've assumed it's Bangkok but this could be wrong.
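Averaging a date-range total across its days can be sketched like this. The helper name `spread_over_range` and the figures are hypothetical, and the end date is assumed to be inclusive:

```python
from datetime import date, timedelta


def spread_over_range(start, end, total):
    """Turn one total for an inclusive date range into equal daily values."""
    days = (end - start).days + 1
    if days <= 0:
        raise ValueError("end date must not be before start date")
    per_day = total / days
    return {start + timedelta(days=i): per_day for i in range(days)}


# Made-up weekly total for one health district, purely for illustration.
weekly = spread_over_range(date(2021, 1, 4), date(2021, 1, 10), 7000)
```

Every day in the range gets the same value, so the resulting graph is stepped rather than showing real day-to-day variation.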

Where are the cases/positive results?

I don't yet have a source for confirmed cases by area over time. This graph comes from the public MOPH testing data. It's not clear if this is where the people who tested positive live or just where the labs that did the testing are located. It's possible some tests were sent to labs in different areas to be processed. The high number of positives in Bangkok during Jan/Feb suggests that some tests might have been related to cases in other provinces.

As previously noted, this seems to be missing positive results and tests from February because the antibody data is not included, so this time period is inaccurate.

Is enough testing being done in each part of Thailand?

Positive rate is calculated as pos/tests for each area and then scaled according to the total positive rate for that time period.
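The scaling step can be read in more than one way. The sketch below shows one possible reading, which is an assumption and not necessarily how the graph was produced: each area's pos/tests rate is rescaled so that the area values add up to the overall positive rate for the period. The function name and numbers are made up.

```python
def scaled_area_rates(area_pos, area_tests):
    """Per-area positive rate, rescaled so the areas sum to the overall rate.
    One possible reading of the scaling step, not a confirmed method."""
    # Skip areas that reported no tests for the period.
    areas = [a for a in area_tests if area_tests[a] > 0]
    rates = {a: area_pos[a] / area_tests[a] for a in areas}
    total_rate = sum(area_pos[a] for a in areas) / sum(area_tests[a] for a in areas)
    rate_sum = sum(rates.values())
    if rate_sum == 0:
        return {a: 0.0 for a in areas}
    return {a: r / rate_sum * total_rate for a, r in rates.items()}


# Made-up numbers for two areas in one time period.
scaled = scaled_area_rates({"A": 10, "B": 30}, {"A": 1000, "B": 1000})
```

With this reading, stacking the areas on one graph gives the overall positive rate, while the height of each band shows which areas have the relatively worse rate.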

As previously noted, this seems to be missing positive results and tests from February because the antibody data is not included, so this time period is inaccurate.
