Scrape Google Maps Reviews with Python: 2025 Guide

If you want to scrape Google Maps reviews with Python, you're dealing with one of the trickier scraping targets on the web. Google loads reviews dynamically, rotates its HTML structure, and actively detects bots. This guide covers two working methods — Playwright and Selenium — with complete code, anti-detection techniques, and honest notes on what breaks and why.

No fluff. Just code that works.

What Is Google Reviews Scraping?

Google reviews scraping is the automated extraction of customer review data from Google Maps and Google Business listings. Instead of copying reviews manually, a script visits business pages and pulls the data for you.

Each review contains useful fields:

Star rating (1–5)
Review text
Reviewer name
Date posted
Business response (if any)

That data has real value. Reputation monitoring, competitive analysis, sentiment tracking, lead qualification — all of it starts with raw review data.
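To make the fields above concrete, here is a minimal sketch of how one review record could be modeled in Python. The `Review` class and `parse_rating` helper are illustrative names, not part of any library, and the sample label format ("4 stars") is an assumption about what the page exposes.

```python
import re
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Review:
    """One scraped review. Field names are illustrative."""
    reviewer_name: str
    rating: Optional[int]            # 1-5, or None if it could not be parsed
    review_text: str
    review_date: str                 # kept as the raw string shown on the page
    business_response: Optional[str] = None


def parse_rating(aria_label: str) -> Optional[int]:
    """Pull the star count out of a label such as '4 stars' (assumed format)."""
    match = re.search(r"(\d+)", aria_label or "")
    if not match:
        return None
    value = int(match.group(1))
    # Reject numbers outside the 1-5 star range
    return value if 1 <= value <= 5 else None


review = Review("Jane D.", parse_rating("4 stars"), "Great coffee.", "2 weeks ago")
print(asdict(review))
```

Keeping the date as the raw string avoids guessing at relative formats such as "2 weeks ago" during extraction; it can be normalized later.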

Why Not Use the Official Google API?

The Google Places API gives you reviews, but with strict limits. You get at most 5 reviews per place, with no way to page through older ones. Pricing scales fast once you exceed the free tier.

Web scraping gives you access to all public reviews, with no artificial cap. The tradeoff: you have to handle Google's anti-bot systems yourself.

Why Python for This Task?

Python has the best ecosystem for browser automation and data extraction. Three libraries do most of the heavy lifting:

Playwright: modern, fast, async-ready, with automatic waiting for elements
Selenium: battle-tested, massive community, maximum compatibility
BeautifulSoup: lightweight HTML parsing once you have the raw content

Google reviews load via JavaScript. Static scrapers (requests + BeautifulSoup alone) won't work here. You need a real browser that executes JS, scrolls the page, and clicks buttons — exactly what Playwright and Selenium do.

The Core Challenge: Why Google Fights Back

Before writing a single line of code, understand what you're up against.

Dynamic Content Loading

Google doesn't serve all reviews in the initial HTML. The first page load shows 10–20 reviews. More load as you scroll. Each batch triggers separate network requests. Your scraper must simulate scrolling to trigger those loads.

Bot Detection Layers

Google runs several detection systems simultaneously:

Browser fingerprinting — screen resolution, fonts, timezone, language
Behavioral analysis — mouse movement patterns, scroll speed, click timing
Request pattern recognition — non-human request frequency
IP reputation — flagging IPs that send too many requests

Hit any of these triggers and you'll see CAPTCHAs, empty results, or a full block.

Constantly Changing HTML Structure

Google updates its frontend regularly. A CSS selector that works today may return zero results next week. Robust scrapers use multiple fallback selectors for every field.
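One way to organize fallback selectors is a small helper that tries each candidate in order and returns the first non-empty result. The sketch below is framework-agnostic: `first_match` is a hypothetical helper, the selectors are placeholders rather than verified Google Maps selectors, and a plain dictionary stands in for the page.

```python
from typing import Callable, Iterable, Optional


def first_match(selectors: Iterable[str],
                lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Try each selector in order and return the first non-empty text."""
    for selector in selectors:
        try:
            value = lookup(selector)
        except Exception:
            # A selector that raises is treated like one that found nothing
            continue
        if value and value.strip():
            return value.strip()
    return None


# Placeholder selectors, for illustration only
NAME_SELECTORS = [
    'div[class*="name"] span',
    'div[class*="Name"] span',
    'button[aria-label] div',
]

# Stand-in for a real page: maps a selector to the text it would return
fake_dom = {'div[class*="Name"] span': "  Jane D. "}
print(first_match(NAME_SELECTORS, fake_dom.get))
```

With Playwright, `lookup` might wrap something like `element.locator(selector).first.inner_text(timeout=500)`. When a selector stops matching, only the candidate list needs updating.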

Method 1: Playwright (Recommended for 2025)

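Playwright is the better starting point for new projects. It is often reported to run 2–3x faster than Selenium (the gap depends on the workload), has built-in async support, and waits for elements automatically, which cuts down on manual timing code. It has no stealth mode of its own, so the anti-detection settings in the code are still needed.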

Setup

python -m venv google_scraper_env
source google_scraper_env/bin/activate  # Windows: google_scraper_env\Scripts\activate
pip install playwright pandas emoji beautifulsoup4 lxml
playwright install chromium

Complete Playwright Scraper

from playwright.sync_api import sync_playwright
import pandas as pd
import re
import emoji
import logging
import time
import random

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

class GoogleReviewsScraper:
    def __init__(self, headless=True):
        self.headless = headless
        self.reviews_data = []

    def clean_text(self, text):
        text = emoji.replace_emoji(text, replace='')
        text = re.sub(r'\s+', ' ', text).strip()
        return text

    def random_delay(self, min_delay=1, max_delay=3):
        time.sleep(random.uniform(min_delay, max_delay))

    def initialize_browser(self):
        playwright = sync_playwright().start()
        browser = playwright.chromium.launch(
            headless=self.headless,
            args=[
                '--disable-blink-features=AutomationControlled',
                '--disable-extensions',
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-dev-shm-usage',
                '--disable-gpu'
            ]
        )
        context = browser.new_context(
            user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
            viewport={'width': 1366, 'height': 768}
        )
        page = context.new_page()
        page.add_init_script("""
            Object.defineProperty(navigator, 'webdriver', {
                get: () => undefined,
            });
        """)
        return playwright, browser, page

    def search_business(self, page, business_name):
        try:
            page.goto("https://www.google.com/maps", wait_until="networkidle")
            self.random_delay(2, 4)
            search_box = page.locator("input[id='searchboxinput']")
            search_box.fill(business_name)
            search_box.press("Enter")
            page.wait_for_timeout(5000)
            logger.info(f"Searched for: {business_name}")
            return True
        except Exception as e:
            logger.error(f"Error searching: {e}")
            return False

    def navigate_to_reviews(self, page):
        try:
            # .first avoids a strict-mode error if more than one tab matches
            reviews_tab = page.get_by_role("tab", name=re.compile("reviews", re.IGNORECASE)).first
            if reviews_tab.is_visible():
                reviews_tab.click()
                page.wait_for_timeout(3000)
                logger.info("Navigated to reviews section")
                return True
            logger.warning("Reviews tab not found")
            return False
        except Exception as e:
            logger.error(f"Error navigating to reviews: {e}")
            return False

    def scroll_and_load_reviews(self, page, max_reviews=100):
        loaded_reviews = 0
        scroll_attempts = 0
        max_scroll_attempts = 20

        while loaded_reviews < max_reviews and scroll_attempts < max_scroll_attempts:
            try:
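                # Reviews typically sit in a scrollable side panel, so scrolling the
                # window loads nothing; bring the last review card into view instead.
                review_cards = page.locator('[data-review-id]')
                if review_cards.count() > 0:
                    review_cards.last.scroll_into_view_if_needed()
                self.random_delay(2, 4)
                current_reviews = review_cards.count()
                if current_reviews > loaded_reviews:
                    loaded_reviews = current_reviews
                    logger.info(f"Loaded {loaded_reviews} reviews...")
                    scroll_attempts = 0
                else:
                    scroll_attempts += 1
                try:
                    # .first avoids a strict-mode error when several buttons match
                    more_button = page.locator("button", has_text=re.compile("more", re.IGNORECASE)).first
                    if more_button.is_visible():
                        more_button.click()
                        self.random_delay(2, 3)
                except Exception:
                    pass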
            except Exception as e:
                logger.error(f"Error during scrolling: {e}")
                break

        logger.info(f"Total reviews found: {loaded_reviews}")
        return loaded_reviews

    def extract_review_data(self, page):
        reviews = []
        try:
            review_elements = page.locator('[data-review-id]').all()
            for element in review_elements:
                try:
                    review_data = {}

                    name_element = element.locator('div[class*="name"] span, div[class*="Name"] span').first
                    review_data['reviewer_name'] = name_element.inner_text() if name_element.is_visible() else "Anonymous"

                    rating_element = element.locator('[role="img"][aria-label*="star"]').first
                    review_data['rating'] = None  # default so every row has the same columns
                    if rating_element.is_visible():
                        rating_text = rating_element.get_attribute('aria-label')
                        rating_match = re.search(r'(\d+)', rating_text or '')
                        if rating_match:
                            review_data['rating'] = int(rating_match.group(1))

                    text_elements = element.locator('span[class*="review-text"], div[class*="review-text"]').all()
                    review_text = ""
                    for text_elem in text_elements:
                        if text_elem.is_visible():
                            review_text += text_elem.inner_text() + " "
                    review_data['review_text'] = self.clean_text(review_text.strip())

                    date_element = element.locator('span[class*="date"], div[class*="date"]').first
                    review_data['review_date'] = date_element.inner_text() if date_element.is_visible() else "Unknown"

                    if review_data['review_text']:
                        reviews.append(review_data)
                except Exception as e:
                    logger.warning(f"Error on individual review: {e}")
                    continue

            logger.info(f"Extracted {len(reviews)} reviews")
            return reviews
        except Exception as e:
            logger.error(f"Extraction error: {e}")
            return []

    def scrape_reviews(self, business_name, max_reviews=100):
        playwright, browser, page = self.initialize_browser()
        try:
            if not self.search_business(page, business_name):
                return []
            if not self.navigate_to_reviews(page):
                return []
            self.scroll_and_load_reviews(page, max_reviews)
            reviews = self.extract_review_data(page)
            self.reviews_data = reviews
            return reviews
        except Exception as e:
            logger.error(f"Scraping failed: {e}")
            return []
        finally:
            browser.close()
            playwright.stop()

    def save_to_csv(self, filename="google_reviews.csv"):
        if self.reviews_data:
            df = pd.DataFrame(self.reviews_data)
            df.to_csv(filename, index=False, encoding='utf-8')
            logger.info(f"Saved to {filename}")
        else:
            logger.warning("No reviews to save")

if __name__ == "__main__":
    scraper = GoogleReviewsScraper(headless=False)
    business_name = "Starbucks Times Square New York"
    reviews = scraper.scrape_reviews(business_name, max_reviews=50)
    if reviews:
        scraper.save_to_csv(f"reviews_{business_name.replace(' ', '_')}.csv")
        print(f"Scraped {len(reviews)} reviews.")
    else:
        print("No reviews scraped.")

What This Code Does

Stealth flags remove the most obvious automation signals, though they do not make the browser undetectable
Random delays between 1–4 seconds mimic human browsing rhythm
Scroll loop keeps loading until it hits max_reviews or runs out of content
Multiple fallback selectors handle Google's frequent HTML changes
CSV export gives you a clean file ready for analysis or import into any tool
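As a quick check on the exported file, the sketch below summarizes the `rating` column using only the standard library. `summarize_ratings` is a hypothetical helper, and the inline sample stands in for a real export; the column names match the keys the scraper writes.

```python
import csv
import io
from collections import Counter
from statistics import mean


def summarize_ratings(csv_file):
    """Return (rated_count, average_rating, histogram) for a reviews CSV."""
    ratings = []
    for row in csv.DictReader(csv_file):
        value = (row.get("rating") or "").strip()
        if value:
            # pandas may write an integer rating as "4.0" when the column has gaps
            ratings.append(int(float(value)))
    if not ratings:
        return 0, None, Counter()
    return len(ratings), round(mean(ratings), 2), Counter(ratings)


# Inline sample in place of a real export
sample = io.StringIO(
    "reviewer_name,rating,review_text,review_date\n"
    "Jane D.,5,Great coffee.,2 weeks ago\n"
    "Sam K.,4.0,Slow service.,a month ago\n"
    "Anonymous,,No stars given.,a year ago\n"
)
count, average, histogram = summarize_ratings(sample)
print(count, average, dict(histogram))
```

For a real file, pass `open("google_reviews.csv", newline="", encoding="utf-8")` instead of the sample. Rows without a parsed rating are skipped rather than counted as zero.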

Method 2: Selenium (Reliable Alternative)

Selenium has been the standard for browser automation for over a decade. It's slower than Playwright but has a larger community and more documentation.

When to Pick Selenium

You're working with legacy infrastructure that already uses it
You need maximum compatibility across browser versions
Your team has existing Selenium expertise

Setup

pip install selenium pandas

Selenium needs a ChromeDriver that matches your Chrome version. With Selenium 4.6 or later, the bundled Selenium Manager downloads a matching driver automatically; on older versions you have to install it yourself.

Complete Selenium Scraper

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import TimeoutException, NoSuchElementException
import pandas as pd
import time
import random
import re
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class SeleniumGoogleReviewsScraper:
    def __init__(self, headless=True):
        self.headless = headless
        self.driver = None
        self.wait = None
        self.reviews_data = []

    def setup_driver(self):
        options = Options()
        if self.headless:
            options.add_argument("--headless")
        options.add_argument("--disable-blink-features=AutomationControlled")
        options.add_experimental_option("excludeSwitches", ["enable-automation"])
        options.add_experimental_option('useAutomationExtension', False)
        options.add_argument("--no-sandbox")
        options.add_argument("--disable-dev-shm-usage")
        options.add_argument("--window-size=1366,768")
        options.add_argument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

        self.driver = webdriver.Chrome(options=options)
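        # execute_script would only patch the current page; registering the override
        # through CDP makes it run before the scripts of every new document.
        self.driver.execute_cdp_cmd(
            "Page.addScriptToEvaluateOnNewDocument",
            {"source": "Object.defineProperty(navigator, 'webdriver', {get: () => undefined});"}
        )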
        self.wait = WebDriverWait(self.driver, 20)
        logger.info("Driver initialized")

    def random_delay(self, min_s=1, max_s=3):
        time.sleep(random.uniform(min_s, max_s))

    def search_google_maps(self, business_name):
        try:
            self.driver.get("https://www.google.com/maps")
            self.random_delay(2, 4)
            search_box = self.wait.until(EC.presence_of_element_located((By.ID, "searchboxinput")))
            search_box.clear()
            for char in business_name:
                search_box.send_keys(char)
                time.sleep(random.uniform(0.05, 0.15))
            search_box.submit()
            self.wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "[data-value='Reviews']")))
            logger.info(f"Searched for: {business_name}")
            return True
        except TimeoutException:
            logger.error("Timeout on search")
            return False

    def click_reviews_tab(self):
        try:
            reviews_tab = self.wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "[data-value='Reviews']")))
            self.driver.execute_script("arguments[0].scrollIntoView(true);", reviews_tab)
            self.random_delay(1, 2)
            reviews_tab.click()
            self.wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "[data-review-id]")))
            logger.info("Reviews tab clicked")
            return True
        except TimeoutException:
            logger.error("Reviews tab not found")
            return False

    def scroll_to_load_reviews(self, target_reviews=100):
        reviews_loaded = 0
        scroll_attempts = 0

        while reviews_loaded < target_reviews and scroll_attempts < 30:
            # Reviews typically sit in a scrollable side panel, so scroll the last
            # review card into view rather than scrolling the window.
            cards = self.driver.find_elements(By.CSS_SELECTOR, "[data-review-id]")
            if cards:
                self.driver.execute_script("arguments[0].scrollIntoView({block: 'end'});", cards[-1])
            self.random_delay(2, 4)
            try:
                show_more = self.driver.find_element(By.XPATH, "//button[contains(text(), 'more') or contains(text(), 'More')]")
                if show_more.is_displayed():
                    ActionChains(self.driver).move_to_element(show_more).click().perform()
                    self.random_delay(2, 3)
            except NoSuchElementException:
                pass

            current_count = len(self.driver.find_elements(By.CSS_SELECTOR, "[data-review-id]"))
            if current_count > reviews_loaded:
                reviews_loaded = current_count
                logger.info(f"Loaded {reviews_loaded} reviews...")
                scroll_attempts = 0
            else:
                # Count each stalled pass exactly once
                scroll_attempts += 1

        return reviews_loaded

    def extract_reviews(self):
        reviews = []
        review_elements = self.driver.find_elements(By.CSS_SELECTOR, "[data-review-id]")
        for element in review_elements:
            try:
                review_data = {}
                try:
                    review_data['reviewer_name'] = element.find_element(By.CSS_SELECTOR, "div[class*='name'] span").text.strip()
                except NoSuchElementException:
                    review_data['reviewer_name'] = "Anonymous"
                try:
                    aria_label = element.find_element(By.CSS_SELECTOR, "[role='img'][aria-label*='star']").get_attribute('aria-label')
                    match = re.search(r'(\d+)', aria_label)
                    review_data['rating'] = int(match.group(1)) if match else None
                except NoSuchElementException:
                    review_data['rating'] = None
                try:
                    text_elems = element.find_elements(By.CSS_SELECTOR, "span[class*='review-text']")
                    review_data['review_text'] = " ".join([e.text for e in text_elems if e.text]).strip()
                except NoSuchElementException:
                    review_data['review_text'] = ""
                try:
                    review_data['review_date'] = element.find_element(By.CSS_SELECTOR, "span[class*='date']").text.strip()
                except NoSuchElementException:
                    review_data['review_date'] = "Unknown"

                if review_data['review_text']:
                    reviews.append(review_data)
            except Exception as e:
                logger.warning(f"Review extraction error: {e}")
                continue

        logger.info(f"Extracted {len(reviews)} reviews")
        return reviews

    def scrape_business_reviews(self, business_name, max_reviews=100):
        try:
            self.setup_driver()
            if not self.search_google_maps(business_name):
                return []
            if not self.click_reviews_tab():
                return []
            self.scroll_to_load_reviews(max_reviews)
            reviews = self.extract_reviews()
            self.reviews_data = reviews
            return reviews
        except Exception as e:
            logger.error(f"Scraping failed: {e}")
            return []
        finally:
            if self.driver:
                self.driver.quit()

    def save_to_csv(self, filename="selenium_reviews.csv"):
        if self.reviews_data:
            pd.DataFrame(self.reviews_data).to_csv(filename, index=False, encoding='utf-8')
            logger.info(f"Saved to {filename}")

if __name__ == "__main__":
    scraper = SeleniumGoogleReviewsScraper(headless=False)
    reviews = scraper.scrape_business_reviews("McDonald's Times Square", max_reviews=75)
    if reviews:
        scraper.save_to_csv("mcdonalds_times_square_reviews.csv")
        print(f"Scraped {len(reviews)} reviews.")

Anti-Detection Techniques That Actually Work

Both scrapers above include basic stealth. Here's what to add when you need to go further.

Proxy Rotation

Single-IP scraping gets blocked fast. Rotate proxies to distribute requests:

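import random

# Playwright's proxy option takes credentials as separate username/password keys.
# Hosts, ports, and credentials below are placeholders.
PROXY_LIST = [
    {"server": "http://proxy1:port", "username": "user", "password": "pass"},
    {"server": "http://proxy2:port", "username": "user", "password": "pass"},
    {"server": "http://proxy3:port", "username": "user", "password": "pass"},
]

def get_random_proxy():
    return random.choice(PROXY_LIST)

# In Playwright (browser is the object returned by initialize_browser()):
context = browser.new_context(proxy=get_random_proxy())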

Residential proxies work better than datacenter proxies for Google specifically. Datacenter IPs get flagged faster.

User Agent Rotation

import random

# Keep to Chrome user agents when driving Chromium: a Firefox user agent on a
# Chromium engine is an inconsistent fingerprint.
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
]

def get_random_ua():
    return random.choice(USER_AGENTS)

Human-Like Typing

import random
import time

def human_type(element, text):
    # element is a Selenium WebElement; send one character at a time
    for char in text:
        element.send_keys(char)
        time.sleep(random.uniform(0.05, 0.2))

Typing at uniform speed is a bot signal. Variable delays per character look human.

Session Warm-Up

Don't go straight to Google Maps. Visit Google Search first, wait a few seconds, then navigate to Maps. Cold sessions that jump directly to scraping targets get flagged more often.
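The warm-up described above might look like the following sketch. `warm_up_session` is a hypothetical helper; it assumes a Playwright-style `page` object with a `goto` method (with Selenium, `driver.get` would take its place), and the wait range is an illustrative choice.

```python
import random
import time


def warm_up_session(page, pause=time.sleep, min_wait=2.0, max_wait=5.0):
    """Visit Google Search, idle briefly, then open Google Maps."""
    page.goto("https://www.google.com")
    pause(random.uniform(min_wait, max_wait))   # idle like a user reading the page
    page.goto("https://www.google.com/maps")
    pause(random.uniform(min_wait, max_wait))   # let Maps settle before searching
```

Passing `pause` as a parameter keeps the function easy to test without real sleeps. In the Playwright scraper it would be called once, before the search step.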

Handling Dynamic Content

Google reviews use infinite scroll — no page numbers, no "next" button. Your scraper needs to keep scrolling until either:

It hits your max_reviews target, or
No new reviews load after several scroll attempts

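The code above handles this with the scroll_attempts counter, which resets whenever new reviews appear. Once it reaches its limit (20 consecutive scrolls without new reviews in the Playwright version, 30 in the Selenium version), the loop stops. That's the right approach: don't loop forever. Lower the limit if you want the scraper to give up sooner.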
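The stop condition can be isolated from the browser code, which makes it easier to reason about. The sketch below is a generic version, assuming two caller-supplied callbacks; `load_until_stable`, `patience`, and `FakeFeed` are illustrative names, and the fake feed only simulates a page that reveals ten reviews per scroll.

```python
from typing import Callable


def load_until_stable(scroll_once: Callable[[], None],
                      count_items: Callable[[], int],
                      max_items: int = 100,
                      patience: int = 5) -> int:
    """Scroll until max_items are loaded or `patience` scrolls in a row add nothing."""
    loaded = 0
    stalled = 0
    while loaded < max_items and stalled < patience:
        scroll_once()
        current = count_items()
        if current > loaded:
            loaded = current
            stalled = 0          # progress resets the stall counter
        else:
            stalled += 1
    return loaded


class FakeFeed:
    """Simulated review list that reveals `batch` more items per scroll."""

    def __init__(self, total, batch=10):
        self.total, self.batch, self.visible = total, batch, 0

    def scroll(self):
        self.visible = min(self.total, self.visible + self.batch)

    def count(self):
        return self.visible


feed = FakeFeed(total=35)
print(load_until_stable(feed.scroll, feed.count, max_items=100, patience=3))  # prints 35
```

The loop may overshoot `max_items` by up to one batch, so trim the extracted list afterwards if an exact cap matters.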

Expanding Truncated Reviews

Long reviews get cut off with a "More" link. To get full text:

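def expand_truncated_reviews(page):
    # Uses the module-level logger defined in the Playwright scraper
    expand_buttons = page.locator("button:has-text('More'), span:has-text('...')")
    count = expand_buttons.count()
    expanded = 0
    for i in range(min(count, 100)):
        btn = expand_buttons.nth(i)
        if btn.is_visible():
            btn.click()
            expanded += 1
            page.wait_for_timeout(300)
    # Report what was actually clicked, not just what matched
    logger.info(f"Expanded {expanded} of {count} matching elements")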

Run this after loading all reviews, before extraction.

Legal and Ethical Considerations

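Scraping publicly visible data is often lawful, but the rules vary by jurisdiction. In the US, the Ninth Circuit's 2022 ruling in hiQ Labs v. LinkedIn held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act. Later in the same case, however, hiQ was found to have breached LinkedIn's user agreement, so a site's terms of service still matter.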

That said, a few rules apply:

Don't overload servers. Keep requests under 10 per minute for casual use.
Respect robots.txt. Google's robots.txt restricts some paths — check it.
Don't republish scraped content verbatim. Aggregate and analyze, don't copy-paste.
Avoid personal data. Reviewer names are public, but don't build profiles on individuals.
Commercial use needs legal review. If you're selling scraped data, consult a lawyer.

The safest approach: scrape for internal analysis, not redistribution.

When Python Scraping Isn't the Right Tool

Writing and maintaining a Google Maps scraper takes real effort. Google changes its HTML structure regularly. Selectors break. Anti-bot measures evolve. You'll spend time debugging, not analyzing.

If you need Google Maps review data at scale — across hundreds or thousands of businesses — a pre-indexed database is faster and more reliable than a DIY scraper.

IBLead indexes 50M+ businesses across 37 countries, with up to 500 Google reviews per listing: full text, star rating, date, and reviewer name. The data is updated weekly and exports instantly to CSV. No scraping infrastructure to maintain, no proxies to manage, no selectors to fix when Google updates its frontend.

For one-off research on a handful of businesses, the Python approach in this guide works fine. For ongoing lead generation or reputation monitoring at scale, $52 for 10,000 leads is hard to beat.


FAQ

How many reviews can I scrape per day without getting blocked?

Start conservative: 100–500 reviews per day, across 5–10 businesses, with 2–3 second delays between actions. With proxy rotation and proper session management, you can push to 1,000–2,000 reviews per day. Aggressive scraping (5,000+ reviews/day) requires residential proxy networks and multiple browser sessions running in parallel.
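Those limits are easier to respect when they live in one place. The sketch below is one possible pacing helper, assuming a single process; `Throttle` is a hypothetical class, its `budget` counts actions per instance rather than per calendar day, and the defaults simply mirror the conservative figures above.

```python
import random
import time


class Throttle:
    """Enforce a randomized minimum gap between actions and a fixed action budget."""

    def __init__(self, min_gap=2.0, max_gap=3.0, budget=500,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_gap = min_gap
        self.max_gap = max_gap
        self.budget = budget
        self.clock = clock
        self.sleep = sleep
        self.used = 0
        self._last = None

    def wait(self):
        """Block until the next action is allowed; raise once the budget is spent."""
        if self.used >= self.budget:
            raise RuntimeError("action budget exhausted")
        if self._last is not None:
            gap = random.uniform(self.min_gap, self.max_gap)
            remaining = gap - (self.clock() - self._last)
            if remaining > 0:
                self.sleep(remaining)
        self._last = self.clock()
        self.used += 1
```

Call `throttle.wait()` before each scroll or click. Create a fresh instance per day, or persist `used` somewhere, if the budget should span several runs.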

Is Playwright or Selenium better for scraping Google Maps reviews with Python?

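Playwright is the better choice for new projects in 2025. It is usually faster (figures of 2–3x are often quoted, but the gap depends on the workload), has built-in async support, and needs less manual wait handling. Neither tool is stealthy out of the box, so both need the anti-detection settings shown in this guide. Selenium is still valid if you have existing infrastructure or need maximum community support. Both methods work, and the code in this guide demonstrates both.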

Why are my selectors returning empty results?

Google updates its frontend regularly. A selector that worked last month may return nothing today. The fix: use multiple fallback selectors for each field, and test with headless=False so you can see what the page actually looks like. The extraction code in this guide passes comma-separated alternatives to each locator for this reason; when a field comes back empty, inspect the page and add the new selector to that list.

Can I scrape Google reviews for competitor analysis?

Yes. Public review data is publicly accessible. Analyzing competitor sentiment, tracking rating trends, or identifying common complaints is a legitimate use case. Don't republish individual reviews verbatim or build personal profiles on reviewers. Focus on aggregate insights.

How do I handle CAPTCHAs?

Prevention beats solving. Slow down your request rate, use residential proxies, add realistic delays, and warm up sessions before scraping. When a CAPTCHA appears anyway: in development, run with headless=False and solve it manually. In production, either integrate a CAPTCHA-solving service or implement an exponential backoff that waits 5–10 minutes before retrying.
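A backoff schedule along those lines could be generated as follows. This is a minimal sketch: `backoff_delays` is a hypothetical helper, and the base of 300 seconds, doubling factor, one-hour cap, and 10% jitter are illustrative defaults rather than tested thresholds.

```python
import random


def backoff_delays(base=300.0, factor=2.0, cap=3600.0, retries=4, jitter=0.1):
    """Yield wait times in seconds that grow by `factor`, capped at `cap`.

    Each delay is shifted by up to +/- jitter (as a fraction of the delay)
    so that parallel sessions do not retry in lockstep.
    """
    for attempt in range(retries):
        delay = min(cap, base * factor ** attempt)
        spread = delay * jitter
        yield delay + random.uniform(-spread, spread)


# With jitter disabled the schedule is deterministic
print(list(backoff_delays(jitter=0)))  # [300.0, 600.0, 1200.0, 2400.0]
```

In a scraper, the loop would sleep for each yielded delay, retry the request, and stop as soon as the page loads without a CAPTCHA.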
