Location-Allocation (P-Median)

The location-allocation problem answers the question: given N potential sites, where should I place P facilities to minimize total customer travel? It’s the mathematical core of network planning for retail chains, hospitals, fire stations, and any service with physical locations.

The P-Median Problem

Minimize Z = Σᵢ Σⱼ (wᵢ × dᵢⱼ × xᵢⱼ)

Subject to:

Each demand point i is assigned to exactly one facility j
Exactly P facilities are open
A demand point can only be assigned to an open facility

Where:

wᵢ = demand (population, spending power) at point i
dᵢⱼ = distance/travel time from demand point i to facility j
xᵢⱼ = 1 if point i is served by facility j, 0 otherwise
P = number of facilities to locate

“Facility location problems involve flows… the p-median problem minimizes total weighted distance.” — de Smith, Goodchild & Longley, Geospatial Analysis, §7.4

Data Requirements

Dataset	What We Use	Link
Census Population	Demand points (population per zone)	View →
MTR Stations & Lines	Network distance matrix between demand and supply points	View →
FEHD Restaurant Licences	Existing facility locations (competitor network)	View →
Rental Indices (RVD)	Facility cost constraints	View →

Variants

Model	Objective	Use Case
P-median	Minimize average distance	Retail chains, post offices
P-centre	Minimize maximum distance	Emergency services, hospitals
Coverage	Maximize population within threshold distance	Ambulance stations, cell towers
Maximal coverage	Cover max demand with budget constraint	Any service with fixed budget

How It Works

Define demand points — census blocks, residential areas, each with a population weight
Define candidate sites — all possible locations you’d consider opening
Calculate distance matrix — travel time or distance from every demand point to every candidate
Run optimization — find the P sites that minimize total weighted distance
Assign customers — each demand point goes to its nearest open facility

Heuristic Solutions

de Smith et al. describe three practical approaches:

1. Greedy Addition

Start with no facilities. Add the one that reduces total distance the most. Repeat P times. Fast but can get stuck in local optima.

2. Interchange (Teitz-Bart)

Start with P random facilities. Try swapping each one with a non-selected candidate. Accept swaps that reduce total distance. Repeat until no improvement. Better solutions, slower convergence.

3. Lagrangian Relaxation

Relax the assignment constraint, solve a simpler problem, then iteratively tighten. Provides bounds on how close your solution is to optimal.

Single-Site Relevance

For evaluating one specific site, location-allocation answers a critical question:

“Is this site in the optimal set?”

If you run p-median for P=5 branches in a district, and your target site doesn’t appear in any solution, that’s a strong signal. If it consistently appears, that validates the location.

Strengths & Limitations

Strengths:

Mathematically rigorous optimization
Considers the entire network simultaneously (not just one catchment)
Answers “where” not just “how good”
Scales to regional and national planning

Limitations:

Requires complete distance matrix (expensive to compute)
NP-hard — heuristics needed for real-world size
Assumes customers always go to nearest facility (ignores brand preference, quality)
Static — doesn’t account for time-varying demand

Relationship to Other Models

The p-median model pairs naturally with the Huff Model and Spatial Interaction Model:

P-median → identifies where to locate
Huff/SIM → predicts how much revenue each location captures
Agent simulation → tests what happens when consumers make complex choices

Implementation Notes

Current Implementation (2026-03-25)

Demand point generation: 50 demand points distributed around the district centroid within a 2km radius using distributeAround(). These represent the spatial distribution of potential customers.

Demand coverage: Fraction of district area covered by an 800m walk radius:

demandCoverage = round((π × 0.8²) / totalDistrictArea × 100)

Where totalDistrictArea = totalPop / max(1, density) in km². Capped at 100%.

Greedy P-median search — tests candidate locations in order:

District centroid
Midpoint between current lat/lng and centroid
10 demand points directly (first 10 of the 50 generated)

Any candidate within 100m of an existing competitor is skipped. The candidate minimizing total summed distance to all 50 demand points is selected as “optimal”.

Optimality score: max(10, min(100, round(100 × exp(-distFromOptimal / 800))))

At optimal point → 100
800m from optimal → ~37
Range hard-capped at 10–100

Weighted average distance: Mean haversine distance from all 50 demand points to the evaluated location (in metres).

Changelog

Date	Change	Why
2026-03-25	Optimality score range clamped to 10–100 (`max(10, min(100, ...))`)	Remote locations with distFromOptimal >4km were scoring 0 or negative; minimum 10 preserves interpretability
2026-03-25	Demand coverage formula uses `π×0.8²/districtArea` (not radius buffer count)	Counting demand points in buffer was noisy for small districts; area fraction is deterministic
2026-03-24	Initial implementation	—

Source

📖 de Smith, M.J., Goodchild, M.F. & Longley, P.A. (2024). Geospatial Analysis: A Comprehensive Guide. §7.4: Facility Location — P-Median and P-Centre Problems.