v0.12: HTTP-First Scraping and the End of Selenium Dependency

by DealMonitor Team · 5 min read
Tags: release, scraping, performance, flights

For months, DealMonitor relied heavily on headless Chrome browsers to scrape prices from online shops. It worked, but it was slow, resource-hungry, and increasingly fragile as more shops deployed bot detection. With v0.12, we are fundamentally rethinking how we fetch prices. The new HTTP-first pipeline delivers most results in under a second instead of 5-15 seconds, and headless browsers only spin up when truly needed.

Why HTTP-First?

Running a headless Chrome instance for every price check is like driving a truck to buy a loaf of bread. Most shops serve their product pages as plain HTML with prices embedded directly in the markup or in structured data. A simple HTTP request is enough to get everything we need.

The problem was never the shops themselves. It was bot detection. Services like Cloudflare analyze incoming requests at the TLS layer, checking whether the connection looks like it comes from a real browser or from a script. A standard Python HTTP client fails this check instantly because its TLS fingerprint looks nothing like Chrome's.

That is where curl_cffi comes in.

Chrome TLS Fingerprint Impersonation

Every browser leaves a unique fingerprint during the TLS handshake: the cipher suites it supports, the order it presents them, the extensions it negotiates. Cloudflare and similar services maintain a database of known browser fingerprints. If your connection does not match one, you get blocked.

curl_cffi is a Python library that wraps libcurl with the ability to impersonate specific browser versions at the TLS level. When DealMonitor makes an HTTP request through curl_cffi, the handshake the server sees matches what a real Chrome browser would send: the same cipher suites, ALPN protocols, and TLS extensions, presented in the same order.

This means we can now fetch pages from Cloudflare-protected shops without launching a browser at all. The request completes in under a second instead of the 5-15 seconds a Selenium session would take.
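As a rough illustration of what such a request can look like, here is a minimal sketch using curl_cffi's requests-style interface. The function name fetch_html and the timeout value are our own placeholders for this post, not DealMonitor's actual code.

```python
try:
    # curl_cffi is a third-party package (pip install curl_cffi).
    from curl_cffi import requests as cffi_requests
except ImportError:  # keeps this sketch importable without the dependency
    cffi_requests = None


def fetch_html(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Fetch a page with a Chrome-like TLS fingerprint; return (status, body).

    Hypothetical helper for illustration only.
    """
    if cffi_requests is None:
        raise RuntimeError("curl_cffi is not installed")
    # impersonate="chrome" asks curl_cffi to mimic a recent Chrome handshake.
    # Older curl_cffi versions may need a pinned target such as "chrome110".
    response = cffi_requests.get(url, impersonate="chrome", timeout=timeout)
    return response.status_code, response.text
```

Calling fetch_html("https://shop.example/product/1") would return the status code and HTML body, which the rest of the pipeline can then inspect for prices or challenge pages.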

The Pipeline in Action

Here is how the new scrape pipeline works for every price check:

  1. HTTP attempt first: We send a request via curl_cffi with a Chrome TLS fingerprint. If the shop returns a valid HTML page with detectable prices, we are done. No browser needed.
  2. Smart header rotation: Each request uses headers matched to the target shop. The User-Agent, Referer, and Accept-Language headers are rotated and localized for the shop's domain and country. A German shop gets German headers; a US shop gets American English.
  3. Selenium fallback: If the HTTP response indicates a challenge page or JavaScript-rendered content that we cannot parse statically, a headless Chrome session takes over in the background.
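The three steps above can be sketched roughly as follows. Everything named here is an assumption made for illustration: the challenge markers, the status codes, the locale table, and the price regex are placeholders, and the two fetchers are passed in as plain callables so the control flow stays visible. User-Agent rotation is left out for brevity.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical locale table keyed by top-level domain.
ACCEPT_LANGUAGE = {"de": "de-DE,de;q=0.9,en;q=0.5", "com": "en-US,en;q=0.9"}
# Hypothetical markers that suggest a bot-challenge page.
CHALLENGE_MARKERS = ("just a moment", "cf-chl", "captcha")
# Very rough stand-in for real price detection.
PRICE_RE = re.compile(r"\d+[.,]\d{2}")


@dataclass
class ScrapeResult:
    html: str
    mode: str  # "http" or "browser"


def headers_for(url: str) -> dict:
    """Pick headers localized for the shop's domain (step 2)."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1]
    return {
        "Accept-Language": ACCEPT_LANGUAGE.get(tld, ACCEPT_LANGUAGE["com"]),
        "Referer": f"https://{host}/",
    }


def needs_browser(status: int, html: str) -> bool:
    """Decide whether the HTTP response is unusable without a browser."""
    if status in (403, 429, 503):
        return True
    lowered = html.lower()
    if any(marker in lowered for marker in CHALLENGE_MARKERS):
        return True
    # No price found statically: assume JavaScript-rendered content.
    return PRICE_RE.search(html) is None


def scrape(url, http_fetch, browser_fetch) -> ScrapeResult:
    """Try HTTP first (step 1); fall back to a browser only if needed (step 3).

    Both fetchers take (url, headers) and return (status, html).
    """
    headers = headers_for(url)
    status, html = http_fetch(url, headers)
    if not needs_browser(status, html):
        return ScrapeResult(html, "http")
    _, html = browser_fetch(url, headers)
    return ScrapeResult(html, "browser")
```

Passing the fetchers in as arguments keeps the decision logic independent of curl_cffi and Selenium, which also makes it easy to exercise with fake responses.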

The result is dramatic. In our testing, roughly 70% of all shops now resolve via HTTP alone. That means faster results for you and significantly reduced server load for us.

Shop Scrape Mode Learning

The system remembers what works. After five consecutive successful HTTP-only scrapes for a given shop domain, that shop is automatically marked as HTTP-only. Future checks skip the Selenium fallback entirely, saving even more resources. If an HTTP-only shop later returns a challenge page, the system resets and tries Selenium again.

This adaptive behavior means the pipeline gets smarter over time without any manual configuration.
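One way to picture this learning rule is a small per-domain counter. The sketch below assumes in-memory state and hypothetical class and method names; the threshold of five comes from the description above, while persistence and the exact reset rules in the real system may differ.

```python
from collections import defaultdict

HTTP_ONLY_THRESHOLD = 5  # consecutive HTTP successes, as described above


class ScrapeModeTracker:
    """Tracks, per shop domain, whether the Selenium fallback can be skipped."""

    def __init__(self):
        self._streak = defaultdict(int)  # domain -> consecutive HTTP successes
        self._http_only = set()

    def record_http_success(self, domain: str) -> None:
        self._streak[domain] += 1
        if self._streak[domain] >= HTTP_ONLY_THRESHOLD:
            self._http_only.add(domain)

    def record_http_failure(self, domain: str) -> None:
        # A challenge page resets the streak and re-enables the fallback.
        self._streak[domain] = 0
        self._http_only.discard(domain)

    def is_http_only(self, domain: str) -> bool:
        return domain in self._http_only
```

A shop only earns HTTP-only status after an unbroken streak, so a single lucky response never disables the fallback.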

Flight Price Tracking

v0.12 also introduces something our users have been requesting since day one: flight price tracking. Airline and travel booking sites are among the most aggressively protected websites on the internet. Scraping them is practically impossible even with headless browsers. Instead, we took the API approach.

Note: flight price tracking is planned for a future release. We are evaluating sustainable API options for reliable airline price data.

The integration described here is built on the Amadeus and Kiwi.com flight APIs. When you add a URL from Expedia, Google Flights, Kayak, Skyscanner, or Momondo, we extract the route and date parameters and query the underlying API directly. This gives us reliable, real-time pricing without touching the website at all.
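To make the extraction step concrete, here is a small sketch that pulls a route and date out of a URL path. The path layout /flights/BER-JFK/2026-05-01 and the travel.example host are assumptions chosen for illustration; each travel site uses its own URL scheme, so a real parser would need one pattern per site.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional
from urllib.parse import urlparse

# Assumed path shape: /flights/<ORIGIN>-<DESTINATION>/<YYYY-MM-DD>
ROUTE_RE = re.compile(r"/flights/([A-Z]{3})-([A-Z]{3})/(\d{4}-\d{2}-\d{2})")


@dataclass(frozen=True)
class FlightQuery:
    origin: str       # IATA airport code, e.g. "BER"
    destination: str  # IATA airport code, e.g. "JFK"
    departure: date


def parse_flight_url(url: str) -> Optional[FlightQuery]:
    """Return the route and date encoded in the URL, or None if absent."""
    match = ROUTE_RE.search(urlparse(url).path)
    if match is None:
        return None
    origin, destination, day = match.groups()
    try:
        departure = date.fromisoformat(day)
    except ValueError:  # e.g. month 13
        return None
    return FlightQuery(origin, destination, departure)
```

The resulting FlightQuery holds exactly the parameters a flight search API needs, so the website itself never has to be fetched.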

Flight prices are notoriously volatile, changing multiple times per day. With API-based tracking, we can check prices much more frequently than our standard 12-hour scrape cycle allows for regular shops.

JSON-LD Deduplication

A subtle but important fix in v0.12 addresses a long-standing issue with structured data. Many shops embed price information both in visible HTML elements and in JSON-LD structured data blocks. Our candidate detection was picking up both, sometimes leading to duplicate candidates with slightly different confidence scores.

The new pipeline deduplicates candidates: when a JSON-LD price matches an HTML candidate, the JSON-LD version takes priority. Structured data is machine-readable by design and far less prone to parsing errors than screen-scraped text. This improves detection accuracy across the board.
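A simplified version of that priority rule might look like the sketch below. The Candidate fields and the choice to match candidates on exact price equality are assumptions for illustration; the real matching logic may be more tolerant.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Candidate:
    price: Decimal
    source: str        # "jsonld" or "html"
    confidence: float


def _rank(cand: Candidate) -> tuple:
    # JSON-LD beats HTML; confidence only breaks ties within a source.
    return (cand.source == "jsonld", cand.confidence)


def dedupe(candidates: list) -> list:
    """Keep one candidate per price, preferring the JSON-LD version."""
    best = {}
    for cand in candidates:
        current = best.get(cand.price)
        if current is None or _rank(cand) > _rank(current):
            best[cand.price] = cand
    return list(best.values())
```

Note that the JSON-LD candidate wins even when the HTML candidate carries a higher confidence score, which mirrors the priority described above.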

Code Quality Cleanup

Behind the scenes, we ran ruff across all 60 backend Python files to enforce PEP 8 compliance. Imports are sorted, whitespace is consistent, unused code is gone. This may not be user-facing, but a clean codebase is a reliable codebase. It also makes it easier for us to ship features faster with fewer bugs.

What This Means for You

The HTTP-first pipeline is the biggest architectural change since DealMonitor launched. Here is what you will notice:

  • Faster price checks: Most shops now respond in under a second instead of 5-15 seconds.
  • Better reliability: Chrome TLS fingerprint impersonation gets past many bot detection systems that previously blocked our checks.
  • Flight tracking: Once it is released, you will be able to add URLs from major travel sites and track flight prices alongside your regular product trackers.
  • Smarter detection: JSON-LD deduplication means fewer false positives and more accurate prices.

We are committed to making price tracking faster, more reliable, and more comprehensive with every release. If you have not tried DealMonitor yet, create your free account and see the difference the HTTP-first pipeline makes. And if you are already tracking prices, your existing trackers are already benefiting from these improvements.

Check out the full technical changelog on our changelog page.
