Cookie Compliance Validation

A compliance scanner flagged 900+ items. Most were false positives. We built a tool to separate signal from noise.

Marcus Penny | January 15, 2025 | 5 min read

The Problem

A compliance scanner flagged over 900 cookies and scripts across a client's website. The scanner does what it's designed to do: surface everything it can detect, including historical, conditional, and proxy-loaded scripts.

The result was a wall of items requiring manual validation.

  • 236 cookies discovered
  • 682 scripts discovered
  • 900+ total items to review

Most were phantom discoveries: scripts that appeared in the scan but never actually executed at runtime. The scanner vendor confirmed this is expected behavior. Their tool surfaces everything. Humans figure out what's real.

Manually validating each item would have taken days. Copy the name, search for documentation, determine the vendor, assign a category, decide if it's necessary. Repeat 900 times.

No one had the time. The task sat incomplete.


What We Built

We built an internal validation tool that focuses only on what actually runs.

1. Crawl the site

A headless browser navigates the site like a real user, visiting pages and interacting with elements.

2. Capture runtime data

Record every cookie set and every script that actually loads during the session. Not what might load. What does load.

3. Store and compare

Save the results to a local database. Cross-reference against the compliance scanner's output.

4. Surface the delta

Identify which items from the scanner actually appeared at runtime. The rest are noise.
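
The crawl and capture steps can be sketched in a few lines. This is a minimal illustration, not the tool itself: it assumes Playwright for Python as the headless browser (the article does not name one), and the names RuntimeCapture, normalize_script_url, and capture_runtime are hypothetical. It visits pages but does not click or scroll, which a real crawl would also need.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class RuntimeCapture:
    """Everything observed while the headless browser visited the pages."""
    cookies: set[tuple[str, str]] = field(default_factory=set)  # (name, domain)
    scripts: set[str] = field(default_factory=set)              # normalized script URLs


def normalize_script_url(url: str) -> str:
    """Drop query string and fragment so cache-busted URLs compare equal."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


def capture_runtime(urls: list[str]) -> RuntimeCapture:
    """Visit each URL in headless Chromium and record what actually loads."""
    # Assumption: Playwright is installed. Imported here so the helpers
    # above stay usable without it.
    from playwright.sync_api import sync_playwright

    capture = RuntimeCapture()

    def on_response(response):
        # Only count scripts the browser really fetched during the session.
        if response.request.resource_type == "script":
            capture.scripts.add(normalize_script_url(response.url))

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()
        page = context.new_page()
        page.on("response", on_response)
        for url in urls:
            page.goto(url, wait_until="networkidle")
        # Cookies present in the browser context after all pages were visited.
        for cookie in context.cookies():
            capture.cookies.add((cookie["name"], cookie["domain"]))
        browser.close()
    return capture
```

Normalizing script URLs is a design choice made for this sketch: without it, the same script loaded with different cache-busting parameters would be counted as separate items.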

Figure: Runtime scanner crawling pages and capturing actual cookie and script execution
The runtime scanner reduced the 900+ items to roughly 80 that needed real attention.
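
The "store and compare" and "surface the delta" steps amount to a join between two tables. A rough sketch using SQLite from the Python standard library follows; the table names, column names, and helper functions are all assumptions made for illustration, and the article does not say which database was used.

```python
import sqlite3

TABLES = {"scanner_items", "runtime_items"}  # hypothetical table names


def build_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the two tables; pass a file path to keep results between runs."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS scanner_items (
            kind TEXT, name TEXT, PRIMARY KEY (kind, name));
        CREATE TABLE IF NOT EXISTS runtime_items (
            kind TEXT, name TEXT, PRIMARY KEY (kind, name));
    """)
    return conn


def load(conn: sqlite3.Connection, table: str, items: list[tuple[str, str]]) -> None:
    """Insert (kind, name) rows, ignoring duplicates."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    conn.executemany(
        f"INSERT OR IGNORE INTO {table} (kind, name) VALUES (?, ?)", items)
    conn.commit()


def confirmed_at_runtime(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Scanner items that were also observed at runtime: the real work list."""
    return conn.execute("""
        SELECT s.kind, s.name
        FROM scanner_items s
        JOIN runtime_items r ON r.kind = s.kind AND r.name = s.name
        ORDER BY s.kind, s.name
    """).fetchall()


def phantom_items(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Scanner items never observed at runtime: candidates for exclusion."""
    return conn.execute("""
        SELECT s.kind, s.name
        FROM scanner_items s
        LEFT JOIN runtime_items r ON r.kind = s.kind AND r.name = s.name
        WHERE r.name IS NULL
        ORDER BY s.kind, s.name
    """).fetchall()
```

Keeping the phantom list, rather than discarding it, is what would let a later scan be filtered against items already known to be noise.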

Adding AI Classification

Eighty items is better than 900, but still tedious. Each one requires research: who's the vendor, what category, what's the purpose, is it necessary?

Figure: AI classification workflow analyzing cookie metadata and returning vendor and category information

We added AI-assisted classification.

The tool sends the metadata for each cookie and script to an AI model, which returns:

  • Vendor identification
  • Category (functional, analytics, marketing, etc.)
  • Purpose description
  • Confidence rating
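
One way to hold the four fields listed above is a small record type plus a parser that validates the model's reply. This sketch assumes the model is asked to answer in JSON with the keys vendor, category, purpose, and confidence, and that confidence is a number between 0 and 1. Those details, the "other" fallback category, and the names Classification and parse_classification are assumptions, not something the article specifies.

```python
import json
from dataclasses import dataclass

# "other" is an assumed catch-all for the article's "etc."
ALLOWED_CATEGORIES = {"functional", "analytics", "marketing", "other"}


@dataclass(frozen=True)
class Classification:
    vendor: str
    category: str
    purpose: str
    confidence: float  # assumed scale: 0.0 (guess) to 1.0 (certain)


def parse_classification(raw: str) -> Classification:
    """Turn a model's JSON reply into a validated Classification."""
    data = json.loads(raw)
    category = str(data["category"]).strip().lower()
    if category not in ALLOWED_CATEGORIES:
        category = "other"  # unknown labels fall back rather than fail
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Classification(
        vendor=str(data["vendor"]).strip(),
        category=category,
        purpose=str(data["purpose"]).strip(),
        confidence=confidence,
    )
```

Validating the reply before storing it keeps a malformed or overconfident answer from slipping into the compliance record unnoticed.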

For items where the AI was uncertain, we added a second-opinion workflow: send the same item to a second model, compare the outputs, and generate a consensus classification.
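
The second-opinion step could look roughly like the sketch below. The article does not describe how consensus was computed, so the rule here is an assumption: consult the second model only below a confidence threshold, accept the result when both models agree on vendor and category, and otherwise keep the more confident answer but flag it for human review. The threshold value and all names are hypothetical, and the two models are passed in as plain callables so no particular AI API is implied.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Classification:
    vendor: str
    category: str
    confidence: float  # assumed scale: 0.0 to 1.0


# A classifier is any function from item metadata to a Classification.
Classifier = Callable[[dict], Classification]

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff for "uncertain"


def classify_with_second_opinion(
    item: dict,
    primary: Classifier,
    secondary: Classifier,
    threshold: float = CONFIDENCE_THRESHOLD,
) -> tuple[Classification, bool]:
    """Return (classification, needs_human_review)."""
    first = primary(item)
    if first.confidence >= threshold:
        return first, False  # confident enough; skip the second model

    second = secondary(item)
    agree = (
        first.vendor.strip().lower() == second.vendor.strip().lower()
        and first.category == second.category
    )
    if agree:
        # Two independent answers match: accept, keeping the higher confidence.
        consensus = Classification(
            first.vendor, first.category, max(first.confidence, second.confidence))
        return consensus, False

    # Disagreement: keep the more confident answer but send it to a person.
    best = max(first, second, key=lambda c: c.confidence)
    return best, True
```

Calling the second model only for uncertain items keeps cost down, and the review flag leaves the final decision on disputed items with a person.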

This eliminated hours of manual research.


The Outcome

Before

  • ~900 items to validate
  • Days of tedious research
  • High error potential
  • Task abandoned incomplete

After

  • ~80 items requiring review
  • Hours instead of days
  • Defensible classifications
  • Repeatable process

The next compliance scan returned a small, accurate list. The phantom discoveries were mapped and excluded. The team could focus on items that actually mattered.


Why This Exists Now

A year ago, no one would have funded this build. The compliance task was annoying but survivable. The ROI on custom tooling didn't justify the engineering time.

AI changed the math. The runtime scanner took a few days to build. The classification layer took less. The total investment was a fraction of what manual validation would have cost.

This used to be a "wouldn't it be nice" conversation. Now it's a tool that was built within a week by a digital operator.


This use case demonstrates how AI and lightweight tooling can transform compliance workflows that would otherwise require days of manual effort.