GeoHash

“Find all restaurants within 1km of my location.” The database has 50 million restaurant records with latitude and longitude columns. A naive query computes the Haversine distance from your location to every single restaurant — a full table scan. GeoHash solves this by encoding a 2D coordinate into a 1D string so that nearby points share a common prefix, turning a spatial proximity query into a simple string prefix match.

How GeoHash Encoding Works

GeoHash recursively bisects the world map, alternating between longitude and latitude, encoding each bisection as a bit. The bits are then grouped into characters using Base32.

Step-by-Step Encoding

Encode latitude 40.7128 (New York City), longitude -74.0060.

Longitude bits (range: -180 to +180):

Step 1: [-180, 180] → midpoint 0     → -74.006 < 0  → bit 0 → range [-180, 0]
Step 2: [-180, 0]   → midpoint -90   → -74.006 > -90 → bit 1 → range [-90, 0]
Step 3: [-90, 0]    → midpoint -45   → -74.006 < -45 → bit 0 → range [-90, -45]
Step 4: [-90, -45]  → midpoint -67.5 → -74.006 < -67.5 → bit 0 → range [-90, -67.5]
Step 5: [-90, -67.5]→ midpoint -78.75→ -74.006 > -78.75→ bit 1 → range [-78.75, -67.5]
...
Longitude bits: 0 1 0 0 1 ...

Latitude bits (range: -90 to +90):

Step 1: [-90, 90]   → midpoint 0    → 40.7128 > 0   → bit 1 → range [0, 90]
Step 2: [0, 90]     → midpoint 45   → 40.7128 < 45  → bit 0 → range [0, 45]
Step 3: [0, 45]     → midpoint 22.5 → 40.7128 > 22.5→ bit 1 → range [22.5, 45]
Step 4: [22.5, 45]  → midpoint 33.75→ 40.7128 > 33.75→ bit 1 → range [33.75, 45]
Step 5: [33.75, 45] → midpoint 39.375→ 40.7128 > 39.375→ bit 1 → range [39.375, 45]
...
Latitude bits: 1 0 1 1 1 ...

Interleave (longitude bit, latitude bit, longitude bit, …):

Longitude: 0 1 0 0 1 ...
Latitude:  1 0 1 1 1 ...
Interleaved: 01 10 01 01 11 ... = 01100 10111 ...

Base32 encode each 5-bit group into a character: 01100 = “d”, 10111 = “r”, … → GeoHash string: "dr5ru...".

Precision and Cell Size

Each additional character halves the cell area. More characters = smaller cell = higher precision.

GeoHash Length	Cell Width	Cell Height	Approximate Area
1	5,000 km	5,000 km	~25,000,000 km²
2	1,250 km	625 km	~781,000 km²
3	156 km	156 km	~24,000 km²
4	39 km	19.5 km	~760 km²
5	4.9 km	4.9 km	~24 km²
6	1.2 km	0.6 km	~0.7 km²
7	153 m	153 m	~0.023 km²
8	38 m	19 m	~700 m²

Rule of thumb: 6 characters for ~1km precision queries (nearby restaurants), 8 characters for ~40m precision (street-level).

The Prefix Property

The key insight: points in the same GeoHash cell share the same GeoHash prefix. If two restaurants both have GeoHash "dr5ru7", they are within the same ~1km cell. If they share prefix "dr5ru", they are within the same ~5km cell.

Restaurant A: lat=40.7128, lng=-74.0060 → GeoHash = "dr5ru7h3"
Restaurant B: lat=40.7135, lng=-74.0052 → GeoHash = "dr5ru7h8"
                                          Shared prefix: "dr5ru7h"  → within ~150m

Restaurant C: lat=40.7580, lng=-73.9855 → GeoHash = "dr5rueg5"
                                          Shared prefix: "dr5ru"    → within ~5km

This means a proximity query becomes a string prefix lookup — the database can use a regular B-tree index on the GeoHash column.

-- Find all restaurants in the same ~1km cell as the user
-- User's GeoHash (6 chars): "dr5ru7"
SELECT * FROM restaurants
WHERE geohash LIKE 'dr5ru7%';

Proximity Queries: The 9-Cell Search

A point near the edge of a cell might be closer to points in a neighboring cell than to points in its own cell. Searching only the target cell misses these.

┌──────────┬──────────┬──────────┐
│ NW       │ N        │ NE       │
│ dr5ru6   │ dr5ru9   │ dr5rud   │
├──────────┼──────────┼──────────┤
│ W        │ ★ Target │ E        │
│ dr5ru3   │ dr5ru7   │ dr5ruc   │
├──────────┼──────────┼──────────┤
│ SW       │ S        │ SE       │
│ dr5ru1   │ dr5ru4   │ dr5ru8   │
└──────────┴──────────┴──────────┘

★ = user's location in cell "dr5ru7"
Search all 9 cells to cover edge cases.

Algorithm:

Compute user’s GeoHash at target precision (e.g., 6 chars)
Compute the 8 neighboring GeoHash cells
Query all 9 cells: WHERE geohash IN ('dr5ru7%', 'dr5ru6%', 'dr5ru9%', ...)
Post-filter: compute exact Haversine distance and discard results beyond the desired radius

import geohash2 as geohash

def find_nearby(lat, lng, precision=6):
    center = geohash.encode(lat, lng, precision)
    neighbors = geohash.neighbors(center)  # returns 8 neighboring cells
    
    search_cells = [center] + neighbors  # 9 cells total
    
    # Query DB with prefix match on all 9 cells
    query = "SELECT * FROM restaurants WHERE " + " OR ".join(
        f"geohash LIKE '{cell}%'" for cell in search_cells
    )
    
    candidates = db.execute(query)
    
    # Post-filter by exact distance
    return [
        r for r in candidates
        if haversine(lat, lng, r.lat, r.lng) <= desired_radius_km
    ]

⚠️

Edge-case pitfall: neighboring GeoHash cells don’t always share a common prefix. In the grid above, dr5ru7 and dr5rud are neighbors but differ in the last character. Near the boundaries of large cells (e.g., at the equator or the prime meridian), neighbors can have completely different prefixes. The 9-cell search with computed neighbors handles this correctly — never rely solely on prefix matching for proximity.

Redis GEO Commands

Redis implements GeoHash internally and provides high-level geospatial commands. Under the hood, coordinates are stored as GeoHash-encoded scores in a sorted set.

import redis
r = redis.Redis()

# Add locations — Redis encodes lat/lng as GeoHash internally
r.geoadd("restaurants", [
    (-74.0060, 40.7128, "joes-pizza"),      # lng, lat, member
    (-74.0052, 40.7135, "katz-deli"),
    (-73.9855, 40.7580, "central-park-cafe"),
])

# Find all restaurants within 1km of a point
nearby = r.georadius(
    "restaurants",
    longitude=-74.005,
    latitude=40.713,
    radius=1,
    unit="km",
    withcoord=True,    # return coordinates
    withdist=True,     # return distance
    count=10,          # limit results
    sort="ASC"         # nearest first
)

# Result: [("joes-pizza", 0.12, (-74.006, 40.7128)),
#          ("katz-deli", 0.08, (-74.0052, 40.7135))]

How GEORADIUS works internally:

Compute GeoHash for the center point at appropriate precision
Compute neighboring cells (the 9-cell search)
Scan the sorted set for members in those cells (using ZRANGEBYSCORE on GeoHash scores)
Filter by exact distance using the Haversine formula
Sort by distance and return top-k

Command	Purpose
`GEOADD`	Add member with coordinates
`GEORADIUS`	Find members within radius of a point
`GEORADIUSBYMEMBER`	Find members within radius of another member
`GEODIST`	Distance between two members
`GEOHASH`	Return the GeoHash string for a member
`GEOPOS`	Return coordinates of a member
`GEOSEARCH`	(Redis 6.2+) Replaces GEORADIUS with more flexible search box/circle

Limitations

Uneven Cell Sizes at High Latitudes

GeoHash cells are defined by equal angular divisions. Near the poles, 1° of longitude covers much less distance than at the equator. A GeoHash cell at 60°N latitude is roughly half as wide in meters as the same-precision cell at the equator.

At equator:  1° longitude ≈ 111 km
At 60°N:     1° longitude ≈ 55 km
At 80°N:     1° longitude ≈ 19 km

Same GeoHash precision = very different physical areas.

For most applications (ride-sharing, restaurant search) this doesn’t matter because the population density at extreme latitudes is low. For global-scale systems, use S2 Geometry (Google) or H3 (Uber) — both use cells with more uniform area.

GeoHash vs Other Spatial Indexes

Property	GeoHash	QuadTree	S2 / H3
Cell shape	Rectangle	Rectangle (adaptive)	Hexagon (H3) / Sphere cap (S2)
Cell size uniformity	Varies with latitude	Adapts to data density	Near-uniform globally
Storage	String column with B-tree index	Custom in-memory tree	64-bit cell ID with B-tree index
Range queries	9-cell prefix search	Tree traversal	Cell union covering
Implementation	Simple — works with any DB	Custom server	Library required (S2/H3)
Best for	Moderate precision, simple infra	Dense/sparse mixed data	Global-scale, uniform precision

ℹ️

Interview tip: When asked about proximity search, start with: “I’d encode locations as GeoHash strings and store them with a B-tree index. For a radius query, I compute the 9 neighbouring GeoHash cells at a precision matching the radius (~6 chars for 1km) and query all 9, then post-filter by exact Haversine distance. For a production system at Uber/Google scale, I’d consider H3 or S2 for more uniform cell areas, but GeoHash with Redis GEO commands works well up to millions of points.” This shows you understand the encoding, the 9-cell search trick, the post-filter step, and when to upgrade.

Test Your Understanding

Two restaurants are 50 meters apart but have different 6-character GeoHash prefixes. Your proximity search misses one. Why?

GeoHash cell boundary problem. GeoHash divides the world into rectangular cells. Two points near a cell boundary can have completely different prefixes despite being physically close. A prefix match only finds points in the same cell.

Fix: Always search the target cell + 8 neighboring cells (the 9-cell grid). This guarantees coverage of nearby points regardless of which side of a boundary they fall on. Then post-filter by exact Haversine distance to remove results outside the desired radius.

GeoHash uses rectangular cells. Near the poles, a 6-character cell covers a vastly different area than at the equator. Why does this matter for a ride-sharing app?

Unequal cell areas. GeoHash cells get narrower (in the east-west direction) as latitude increases because longitude lines converge at the poles. A 6-char cell at the equator covers ~1.2km × 0.6km. Near the Arctic, the same precision covers ~1.2km × 0.1km. A “find drivers within 1km” query needs different precisions at different latitudes.

Alternatives: H3 (Uber’s hexagonal grid) and S2 (Google’s spherical cells) provide more uniform cell areas across latitudes. For global services like Uber or Google Maps, these are preferred. For regional services (US-only restaurant finder), GeoHash’s non-uniformity is negligible.

Search Autocomplete / Typeahead QuadTree