Amazon is the world's largest online marketplace, and it is also one of the hardest websites to scrape reliably. Prices change constantly, product pages are loaded with dynamic content, and Amazon's bot detection is relentless. With v0.10, we are tackling the Amazon challenge head-on with a dedicated API integration, along with a new AliExpress API and a series of price detection fixes that improve accuracy across the board.
The Amazon Scraping Challenge
If you have ever tried to automate anything on Amazon, you know the pain. Product pages are not simple HTML documents. They are complex JavaScript applications that render differently based on your location, browsing history, and dozens of other signals. A price that shows as EUR 29.99 in one session might show as USD 33.50 in another, depending on which cookies are set.
But the real problem is deeper than that. Amazon product pages contain dozens of numbers that look like prices but are not: shipping costs, ratings, seller counts, review scores, variant prices for different sizes or colors, crossed-out "was" prices, and price fragments split across multiple HTML elements. Our ML model had to navigate all of this, and while it did a good job, there were persistent edge cases that no amount of training data could fully resolve.
Amazon Creators API
Amazon's Product Advertising API (PA-API v5), now deprecated, was the old standard for accessing product data programmatically. Its replacement, the Creators API, uses OAuth 2.0 authentication and provides structured product data including current prices, availability, and product metadata.
When you add an Amazon product URL to DealMonitor, we extract the ASIN (Amazon Standard Identification Number) from the URL and query the Creators API directly. The response includes the exact current price in the marketplace's native currency. No scraping, no bot detection, no ambiguity.
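As a rough illustration of the ASIN extraction step (not DealMonitor's actual code), here is a minimal Python sketch. The function name `extract_asin` and the two URL shapes it recognizes (`/dp/<ASIN>` and `/gp/product/<ASIN>`) are assumptions; real product URLs may take other forms.

```python
import re
from urllib.parse import urlparse

# Assumption: an ASIN is 10 uppercase alphanumeric characters that follow
# "/dp/" or "/gp/product/" in the URL path.
ASIN_PATTERN = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:/|$)")

def extract_asin(url):
    """Return the ASIN found in an Amazon product URL, or None."""
    match = ASIN_PATTERN.search(urlparse(url).path)
    return match.group(1) if match else None

print(extract_asin("https://www.amazon.de/Some-Product/dp/B08N5WRWNW/ref=sr_1_1?keywords=x"))
```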
The API supports multiple Amazon regions: US, EU, and JP. Each region has its own authentication endpoint, and DealMonitor automatically routes requests to the correct regional API based on the Amazon domain in your URL (amazon.de, amazon.com, amazon.co.jp, etc.).
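The domain-based routing could be sketched roughly as follows. The mapping table and the function name `region_for_url` are hypothetical, and the table covers only the three domains named above.

```python
from urllib.parse import urlparse

# Hypothetical mapping from Amazon domain to API region.
REGION_BY_DOMAIN = {
    "amazon.com": "US",
    "amazon.de": "EU",
    "amazon.co.jp": "JP",
}

def region_for_url(url):
    """Pick the API region from the Amazon domain in the URL."""
    host = (urlparse(url).hostname or "").lower()
    for domain, region in REGION_BY_DOMAIN.items():
        # Accept the bare domain as well as subdomains such as "www.".
        if host == domain or host.endswith("." + domain):
            return region
    raise ValueError(f"No region configured for host: {host!r}")

print(region_for_url("https://www.amazon.co.jp/dp/B08N5WRWNW"))
```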
AliExpress Affiliate API
AliExpress presents similar challenges to Amazon: heavy JavaScript rendering, aggressive bot protection, and prices that vary by region and user session. We integrated the AliExpress affiliate API to get reliable price data directly from their system.
Like the Amazon integration, this works transparently. Add an AliExpress URL, and DealMonitor handles the rest. The API returns the current price, promotional price if applicable, and currency information. No browser session required.
Currency Forcing for International Shops
One of the most frustrating issues our users reported was inconsistent currencies on Amazon. You track a product on amazon.de expecting EUR prices, but the scraper session picks up USD or GBP because of how Amazon handles international visitors.
We solved this with i18n-prefs cookie forcing. When DealMonitor scrapes any Amazon domain, it sets the i18n-prefs cookie to the marketplace's native currency before making the request. Amazon.de always returns EUR, amazon.com always returns USD, amazon.co.uk always returns GBP. No more currency confusion.
This fix applies to both our HTTP scraping pipeline and Selenium sessions, ensuring consistent results regardless of server location or IP geolocation.
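To make the idea concrete, here is a small sketch of how the cookie value might be chosen per domain. The `i18n-prefs` cookie name comes from the text above; the lookup table and the helper name `currency_cookies` are assumptions for illustration only.

```python
from urllib.parse import urlparse

# Native currency per marketplace, limited to the domains mentioned above.
NATIVE_CURRENCY = {
    "amazon.de": "EUR",
    "amazon.com": "USD",
    "amazon.co.uk": "GBP",
}

def currency_cookies(url):
    """Return the cookie dict that pins the marketplace's native currency."""
    host = (urlparse(url).hostname or "").lower()
    for domain, currency in NATIVE_CURRENCY.items():
        if host == domain or host.endswith("." + domain):
            return {"i18n-prefs": currency}
    return {}  # Unknown domain: send no currency cookie.

print(currency_cookies("https://www.amazon.de/dp/B08N5WRWNW"))
```

The resulting dictionary could be passed as the `cookies` argument of an HTTP client such as `requests.get`, or added to a Selenium session with `driver.add_cookie` after first loading a page on the same domain.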
Steam Age Gate Bypass
Steam is one of the most popular platforms for PC gaming, and many of its product pages are behind an age verification gate. Before you can see the price of a mature-rated game, Steam asks you to confirm your age. For automated price checks, this was a blocker.
We now set a birthtime cookie before accessing Steam pages, which bypasses the age gate automatically. Combined with our IsThereAnyDeal integration (added in v0.11), DealMonitor now handles game price tracking comprehensively.
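A minimal sketch of building such a cookie is shown below. The `birthtime` cookie name is taken from the text above; encoding the value as a Unix timestamp of a birth date, the default year, and the helper name `steam_age_cookies` are assumptions.

```python
from datetime import datetime, timezone

def steam_age_cookies(birth_year=1990):
    """Build a cookie dict carrying a birth date as a Unix timestamp."""
    # Assumption: the cookie value is the birth date in seconds since the epoch (UTC).
    birth = datetime(birth_year, 1, 1, tzinfo=timezone.utc)
    return {"birthtime": str(int(birth.timestamp()))}

print(steam_age_cookies())
```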
Price Fragment Filtering
Amazon displays prices in a split format: the whole number and the decimal part are in separate HTML elements (a-price-whole and a-price-fraction). Our candidate extractor was sometimes picking up these fragments as individual price candidates, leading to incorrect detections like "29" instead of "29.99".
v0.10 adds explicit filtering for these known fragment patterns. When we detect a price-whole or price-fraction element, we combine them before they enter the candidate pipeline. This eliminates an entire class of detection errors on Amazon pages.
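The combination step might look roughly like the sketch below, which uses only the Python standard library. It assumes the snippet contains a single price, and that the decimal point sits in a nested element inside `a-price-whole` (called `a-price-decimal` in the sample markup, which is an assumption). The class and function names are illustrative, not DealMonitor's.

```python
import re
from decimal import Decimal
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}  # tags with no closing tag

class PriceFragments(HTMLParser):
    """Collect text inside a-price-whole / a-price-fraction elements."""

    def __init__(self):
        super().__init__()
        self._roles = []  # role of each open element, innermost last
        self.parts = {"whole": "", "fraction": ""}

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if "a-price-whole" in classes:
            role = "whole"
        elif "a-price-fraction" in classes:
            role = "fraction"
        else:
            # Nested elements inherit the role of their parent.
            role = self._roles[-1] if self._roles else None
        self._roles.append(role)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._roles:
            self._roles.pop()

    def handle_data(self, data):
        role = self._roles[-1] if self._roles else None
        if role:
            self.parts[role] += data

def combine_amazon_price(html):
    """Merge the whole and fraction fragments into one Decimal, or None."""
    parser = PriceFragments()
    parser.feed(html)
    whole = re.sub(r"\D", "", parser.parts["whole"])
    fraction = re.sub(r"\D", "", parser.parts["fraction"])
    if not whole:
        return None
    return Decimal(f"{whole}.{fraction or '0'}")

sample = ('<span class="a-price-whole">29<span class="a-price-decimal">.</span></span>'
          '<span class="a-price-fraction">99</span>')
print(combine_amazon_price(sample))
```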
CSS Class Subset Matching
When you add a tracker via the browser extension, we capture the CSS classes of the price element to identify it on future scrapes. But here is the catch: the same element can have different CSS classes depending on how the page is loaded. A browser extension sees the fully rendered DOM with all JavaScript-added classes, while our scraper might see a simpler version.
We replaced exact class matching with subset matching. If the extension captured classes "price main-price sale-price" and our scraper sees "price main-price", that is still a match. This seemingly small change dramatically improved tracker reliability across shops where JavaScript dynamically adds or removes CSS classes.
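In code, the subset check can be expressed with plain set operations. This sketch follows the direction of the example above (the scraper sees a subset of what the extension captured); the function name `classes_match` is hypothetical.

```python
def classes_match(captured, scraped):
    """True if every class the scraper sees was also captured by the extension."""
    captured_set = set(captured.split())
    scraped_set = set(scraped.split())
    # An element with no classes at all should not match everything.
    return bool(scraped_set) and scraped_set <= captured_set

print(classes_match("price main-price sale-price", "price main-price"))
```

Whether to also accept the reverse case, where the scraper sees extra classes, is a design choice this sketch leaves open.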
Non-Price Tag Filtering
Our candidate extractor was occasionally picking up numbers from form elements: dropdowns, quantity inputs, and configuration selectors. These elements often contain numbers that look like prices but represent quantities, sizes, or option indices.
We now exclude option, select, input, and textarea elements from candidate detection. Prices belong in display elements, not form controls.
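A simplified sketch of this exclusion, assuming candidates arrive as `(tag, text)` pairs (a made-up representation for illustration):

```python
# Form-control tags excluded from candidate detection, as listed above.
EXCLUDED_TAGS = {"option", "select", "input", "textarea"}

def filter_candidates(nodes):
    """Keep only (tag, text) pairs whose tag is not a form control."""
    return [(tag, text) for tag, text in nodes if tag.lower() not in EXCLUDED_TAGS]

nodes = [("span", "29.99"), ("option", "2"), ("input", "1"), ("div", "19.50")]
print(filter_candidates(nodes))
```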
Split Price Fix
Some shops display cents as a superscript next to the main price: "11." in the main element, followed by a superscript "99". Our extractor was sometimes losing the decimal separator when combining these elements, producing "1199" instead of "11.99". We fixed the combination logic to correctly handle trailing dots and commas before superscript cent values.
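The corrected combination logic could be sketched like this. The helper name `combine_split_price` is hypothetical, and the sketch assumes the superscript element always holds the cents.

```python
import re
from decimal import Decimal

def combine_split_price(main_text, cents_text):
    """Join a main price like "11." or "11," with superscript cents like "99"."""
    # Drop a trailing "." or "," so it is not lost or doubled when joining.
    main = main_text.strip().rstrip(".,")
    # Remaining separators in the main part are treated as thousands separators.
    main_digits = re.sub(r"\D", "", main)
    cents = re.sub(r"\D", "", cents_text)
    if not main_digits:
        return None
    return Decimal(f"{main_digits}.{cents or '0'}")

for main, cents in [("11.", "99"), ("11,", "99"), ("11", "99")]:
    print(combine_split_price(main, cents))
```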
Infrastructure Improvements
Under the hood, we replaced undetected-chromedriver with standard Selenium plus stealth JavaScript. This gives us a more maintainable and reliable browser automation stack. We also added CI quality gates: every code change must now pass linting checks, build verification, and smoke import tests before it can be merged.
What This Means for You
- Amazon tracking is reliable: API integration means accurate prices in the correct currency, every time.
- AliExpress tracking works: No more blocked scrapes on one of the world's largest marketplaces.
- Better price detection: Fragment filtering, split price fixes, and tag exclusions reduce false positives.
- More robust trackers: CSS subset matching means your trackers survive DOM changes between page loads.
These improvements are already live for all users. Log in to your dashboard to see the difference, or check our changelog for the complete list of changes.
