SXStudio

System Design Reading Notes 15: Design Proximity Service

This is my reading notes for Chapter 1 in book “System Design Interview – An insider’s guide (Vol. 2)”.

Overview of Proximity Service

Proximity Service

A Proximity Service finds nearby locations (restaurants, hotels, theaters) based on a user’s geographical data, such as Yelp’s search for nearby restaurants or Google Maps locating gas stations. This is central to many location-based services (LBS) that rely on geospatial indexing and real-time location data to provide personalized search results.

Key Technologies Involved:


Key Concepts of a Proximity Service

  1. Search Based on Location:
    • The core functionality is to find businesses based on latitude, longitude, and an optional search radius.
    • Relevant Knowledge:
      • Distance Calculation: Haversine formula is used to calculate the shortest distance between two points on the Earth’s surface.
      • Proximity Search: In many systems, users can refine searches using filters like time (e.g., open now), price range, and reviews.
      • Real-Time Search Optimization: Caching frequent queries or applying rate limits for high-traffic times.
    • Example: A user opens Google Maps to find the nearest gas station within 5 km.
  2. Business Operations (CRUD):
    • The system allows business owners to create, update, delete, and retrieve business listings. Updates do not need to be immediate.
    • Relevant Knowledge:
      • Database Write Optimizations: Often, a delay in reflecting changes is used to reduce write loads, especially in eventual consistency models.
      • CAP Theorem: Consistency vs. availability trade-offs can occur when prioritizing real-time data updates (consistency) versus ensuring uptime (availability).
    • Example: A restaurant owner updates their opening hours or special deals, but the changes become visible the next day.
  3. Viewing Business Details:
    • Once the search results are returned, users can view more detailed information such as reviews, ratings, images, and operating hours.
    • Relevant Knowledge:
      • Microservices Architecture: Business information retrieval can be separated into smaller services, such as handling images, reviews, or hours independently, optimizing for scalability.
      • Pagination in APIs: Pagination ensures that a system doesn’t overload the user or the backend by fetching all results at once. This is critical in scenarios with thousands of results.
    • Example: A user clicks on a restaurant listing to view its reviews, photos, and average wait times on Yelp.

Step 1: Understanding the Problem

Functional Requirements:

Non-Functional Requirements:


Step 2: Proposing a High-Level Design

API Design:

APIs follow the RESTful design principles for communication between the client and server. Typical API endpoints:

High-Level System Components:

  1. Load Balancer: Distributes requests evenly across servers to prevent overloading a single server.
    • Relevant Knowledge:
      • Global Load Balancers: Ensuring requests are routed based on the geographical proximity to reduce latency.
      • Sticky Sessions: Keeping users on the same server during the session to minimize state loss.
  2. Location-Based Service (LBS): This service is responsible for querying the database for nearby businesses within the user-defined radius.
    • Relevant Knowledge:
      • Geospatial Queries: Using geospatial databases like PostGIS or Redis Geospatial to efficiently retrieve location data.
      • Real-Time Data: LBS may fetch real-time data from location databases or cache results for quicker retrieval.
  3. Business Service: Manages operations related to adding, updating, or deleting businesses.
    • Relevant Knowledge:
      • Event-Driven Architecture: Updates can be pushed to the system via event queues (e.g., Kafka), ensuring asynchronous updates and scalability.
  4. Database Cluster: Data is stored using a primary-replica architecture. The primary database handles writes, while replicas handle read operations.
    • Relevant Knowledge:
      • Replication Delays: Understanding the slight delay between the primary and replicas, which might cause the latest updates to be briefly unavailable in read queries.

Algorithms for Finding Nearby Businesses

  1. Two-Dimensional Search:
    • Uses a simple circle-based method to find all businesses within the user-defined radius.
    • Relevant Knowledge:
      • Inefficiency for Large Datasets: Scanning all records for each query can be computationally expensive as the number of businesses grows.
      • Bounding Box: Instead of a circle, you can define a bounding box to narrow down the search quickly.
  2. Geospatial Indexing:
    • Used to efficiently organize and query location data.
    Types:
    • Geohash: Converts latitude and longitude into a string for indexing, reducing 2D data to 1D.
      • Relevant Knowledge:
        • Precision Trade-offs: A higher precision geohash (longer string) results in smaller grid sizes, allowing more accurate queries.
      • Example: Searching within a small grid that covers only a few city blocks.
    • Quadtree: Divides the world into four quadrants, dynamically subdividing based on density.
      • Relevant Knowledge:
        • Dynamic Grids: The grid size adjusts based on how dense the area is with businesses—smaller in cities, larger in rural areas.
      • Example: For dense cities like New York, the quadtree can create highly granular grids to optimize queries.

Step 3: Deep Dive – Caching & Database Scaling

Caching:

Database Scaling:


Step 4: Final Design & Wrap-Up

The final system design combines geospatial indexing (Geohash or Quadtree), caching (Redis), and database replication to handle millions of users performing location-based searches. By leveraging these technologies, the system ensures scalability, low latency, and data privacy.

Extended Example:

This comprehensive design ensures that the proximity service can scale efficiently while offering low-latency responses and handling real-time data updates responsibly.

Exit mobile version