Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@max-ostapenko what's the latest with this? It's still marked as draft.
…ove outdated scripts and update requirements
…ies and improve clarity
…bution and technologies analysis
Pull request overview
This pull request implements Privacy 2025 queries for the HTTP Archive almanac, covering tracking technologies, cookie analysis, privacy metrics, and compliance frameworks. The PR adds multiple new SQL queries to analyze privacy-related data from the July 2025 crawl and updates supporting Python utilities for data processing and export to Google Sheets.
Changes:
- Added 19 new SQL queries for privacy analysis (trackers, cookies, IAB frameworks, referrer policies, etc.)
- Updated utility scripts for WhoTracksMe data and Have I Been Pwned breach data
- Enhanced the BigQuery-to-Sheets notebook with improved error handling and local development support
- Added new Python dependencies for data processing (tabulate, gspread, ipykernel, db-dtypes)
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/requirements.txt | Added dependencies for Jupyter notebook and Google Sheets integration |
| sql/util/whotracksme_trackers.py | Updated date to 2025-07-01 for new crawl data |
| sql/util/haveibeenpwned.py | Refactored breach data retrieval with updated schema and TRUNCATE mode |
| sql/util/bq_writer.py | Removed CSV source format specification from BigQuery load config |
| sql/util/bq_to_sheets.ipynb | Major refactor with improved Colab compatibility and error handling |
| sql/2025/privacy/*.sql | 19 new SQL queries analyzing trackers, cookies, privacy frameworks, and compliance |
| sql/2024/privacy/number_of_websites_with_related_origin_trials.sql | Refactored origin trial parsing function |
@tunetheweb how does it look?
tunetheweb left a comment
LGTM on the SQL from a (very!) quick look, but I have one question on the requirements.txt changes.
Makes progress on #4083
Tracking and Technologies
Cookie Analysis
Other Privacy Metrics
Privacy Compliance Frameworks