In fetch/747 it was suggested that browsers should send Accept-Encoding: identity along with range requests, else some servers ignore the range and return a 200. So, this is a test!
data.json contains all the URLs in HTTP Archive that end in mp3 or mp4, de-duped by host. Warning: These are media files from the internet, so many will not be safe for work.
Here's the query:
SELECT ANY_VALUE(url) AS url
FROM `httparchive.runs.2018_02_15_requests`
WHERE REGEXP_CONTAINS(url, r'\.(mp4|mp3)$')
GROUP BY req_host
The test makes requests to those URLs with the following headers:
Range: bytes=0-
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36
It makes a request without an Accept-Encoding header, then makes an additional request for each of these headers:
And records the status code & Content-Encoding header for each response, or {err: true} if the request timed out or failed.
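As a rough illustration of the procedure described above, here is a minimal sketch of a single probe. It is not the code in index.js: the names `buildHeaders`, `summarize`, and `probe`, the result fields `status` and `contentEncoding`, and the 10 second timeout are all assumptions made for this example. It uses Node's built-in `http`/`https` modules, which send no Accept-Encoding header unless one is set explicitly.

```javascript
const http = require('node:http');
const https = require('node:https');

// Build the request headers described above.
// Passing undefined leaves Accept-Encoding out entirely.
function buildHeaders(acceptEncoding) {
  const headers = {
    Range: 'bytes=0-',
    'User-Agent':
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 ' +
      '(KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36',
  };
  if (acceptEncoding !== undefined) {
    headers['Accept-Encoding'] = acceptEncoding;
  }
  return headers;
}

// Reduce a response to the two values the test records.
// The field names here are hypothetical, not taken from results/out.json.
function summarize(res) {
  return {
    status: res.statusCode,
    contentEncoding: res.headers['content-encoding'] ?? null,
  };
}

// Make one request and resolve with the summary, or {err: true}
// if the request fails or times out (the timeout value is an assumption).
function probe(url, acceptEncoding, timeoutMs = 10000) {
  return new Promise((resolve) => {
    const client = url.startsWith('https:') ? https : http;
    const options = { headers: buildHeaders(acceptEncoding), timeout: timeoutMs };
    const req = client.request(url, options, (res) => {
      resolve(summarize(res));
      // Only the status and headers matter, so drop the body.
      res.on('error', () => {});
      res.destroy();
    });
    req.on('timeout', () => req.destroy(new Error('timeout')));
    req.on('error', () => resolve({ err: true }));
    req.end();
  });
}
```

Calling `probe(url)` covers the request without Accept-Encoding, and `probe(url, 'identity')` covers one of the additional requests.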
The results are in results/out.json, but I recommend the interactive HTML results (they depend on modern features like transform streams, so they may only work in Chrome).
Note: The data is ndjson. Also, some servers may be unstable and return different results each time.
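Since ndjson is one JSON value per line rather than a single JSON document, `JSON.parse` on the whole file will fail. A minimal sketch of reading it line by line follows; the helper name `parseNdjson` is an assumption for this example, not something from the repo.

```javascript
// Parse newline-delimited JSON: one JSON value per non-empty line.
function parseNdjson(text) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line !== '')
    .map((line) => JSON.parse(line));
}
```

For example, `parseNdjson(require('node:fs').readFileSync('results/out.json', 'utf8'))` should give an array with one entry per line. This loads the whole file into memory, so a streaming reader may suit larger files better.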
# Generate results/out.json from data.json
node index.js
# Print the results for a particular URL
node test-url.js https://www.narita-airport.jp/files/bg.mp4
# Print results where the return is different depending on Accept-Encoding
node log-interesting-results.js
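One way to read "different depending on Accept-Encoding" is that the recorded outcomes for a URL are not all the same. The sketch below shows that check under a hypothetical record shape (`results` keyed by the header value sent, each entry holding `status` and `contentEncoding`, or `err: true`). The real layout of results/out.json and the logic in log-interesting-results.js may differ.

```javascript
// Return true if the outcomes for one URL are not all identical.
// Assumed (hypothetical) shape:
//   { url, results: { '<header value>': { status, contentEncoding } | { err: true } } }
function variesByEncoding(record) {
  const outcomes = Object.values(record.results).map((r) =>
    r.err ? 'err' : `${r.status}|${r.contentEncoding ?? ''}`
  );
  return new Set(outcomes).size > 1;
}
```

Because some servers return different results from run to run, a URL flagged this way may be worth re-checking with test-url.js before drawing conclusions.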