# README

This archive contains code and datasets for [Triplet Censors: Demystifying Great Firewall’s DNS Censorship Behavior](https://www.usenix.org/conference/foci20/presentation/anonymous).

## Updates

As of August 11, 2020, we have released all our code and datasets to the maximum extent that does not harm our anonymity. The released code and datasets support all major findings in our paper. We will continue anonymizing and releasing the remaining code and datasets, with the goal of making our work highly reproducible within days. Please check back!

## Explanations

The largest and most important dataset is `./all_more_fields.csv`, which was extracted from a set of pcap files spanning 9 months using:

```sh
bash -x extract_all_pcap_to_csv_more_fields.sh
```

To make it easier to analyze the different injectors of the GFW, we categorize the packets sent by each injector into `injector1.csv`, `injector2.csv`, and `injector3.csv`. To generate these files, run one of the following two commands yourself:

```sh
# This is the faster way
bash -x split_by_awk_new.sh
```

or

```sh
# This is a more readable but much slower way
python3 split_with.py all_more_fields.csv
```
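For readers who want a feel for what such a split involves, here is a minimal sketch of grouping CSV rows by a per-row label. It does not reproduce the logic of `split_by_awk_new.sh` or `split_with.py`. The `split_rows` helper, the `classify` callback, and the column index in the usage comment are all hypothetical; the real fingerprinting rules are in the scripts above.

```python
import csv
from collections import defaultdict


def split_rows(rows, classify):
    """Group rows by the label returned by classify(row).

    `classify` is a caller-supplied (hypothetical) function that maps one
    CSV row, given as a list of strings, to a label such as "1", "2" or "3".
    Rows for which it returns None are skipped.
    """
    groups = defaultdict(list)
    for row in rows:
        label = classify(row)
        if label is not None:
            groups[label].append(row)
    return dict(groups)


def write_groups(groups, template="injector{}.csv"):
    """Write each group of rows to its own CSV file, e.g. injector1.csv."""
    for label, rows in groups.items():
        with open(template.format(label), "w", newline="") as out:
            csv.writer(out).writerows(rows)


# Usage (assumption: column 0 already holds an injector label, which is
# NOT how the real dataset is laid out):
#
#   with open("all_more_fields.csv", newline="") as src:
#       groups = split_rows(csv.reader(src), lambda row: row[0])
#   write_groups(groups)
```

Note that this version holds every matching row in memory, so for a file as large as `all_more_fields.csv` the provided scripts are the better choice.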

## README on subdirectories

We organized the code and datasets supporting different findings in the paper into separate subdirectories.
Each subdirectory has a `README.md` file explaining which findings the code and datasets there support.

For example, `./delay_differences/README.md` reads as:

```
The code and dataset under this directory were used to support the following findings in our work:
    "We also compare the time between sending our DNS query and when we receive the injected reply to get a sense of where the injectors are located. Specifically, we compare the delays of the three injectors and find that more than 90% of the time the delays are within 0.2 ms of each other. This would support the theory that these three devices are installed in the same physical location."
```
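
The comparison described in that finding can be illustrated with a short sketch. It is not the analysis code in `./delay_differences/`. The function name `fraction_within` and the assumption that delays from two injectors are already paired per query and expressed in milliseconds are ours, for illustration only.

```python
def fraction_within(delays_a, delays_b, threshold_ms=0.2):
    """Return the fraction of paired delays that differ by at most threshold_ms.

    Assumption: delays_a[i] and delays_b[i] are the delays (in ms) of two
    injectors' replies to the same query.
    """
    pairs = list(zip(delays_a, delays_b))
    if not pairs:
        raise ValueError("no paired measurements")
    close = sum(1 for a, b in pairs if abs(a - b) <= threshold_ms)
    return close / len(pairs)


# Made-up example values, not measurements from the dataset.
injector1_delays = [10.0, 20.0, 30.0, 40.0]
injector2_delays = [10.1, 20.05, 30.5, 39.9]
share = fraction_within(injector1_delays, injector2_delays)
```

A result above 0.9 for each pair of injectors would correspond to the "more than 90% of the time" statement quoted above.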
