Architecture
Overview
vibeProj is a pure Python + CuPy library – no compiled C/C++ extensions, no CMake. GPU kernels are compiled at runtime via CuPy’s NVRTC interface.
    Transformer.from_crs("EPSG:4326", "EPSG:32631")
           │
           ▼
    ┌─────────────┐
    │   crs.py    │  pyproj resolves EPSG → projection params
    └──────┬──────┘
           │
           ▼
    ┌──────────────┐
    │ pipeline.py  │  chains pre/post ops with projection core
    └──────┬───────┘
           │
           ├──── GPU? ──► fused_kernels.py  (single kernel launch)
           │
           └──── CPU? ──► projections/<name>.py  (NumPy element-wise)
Module map
| Module | Responsibility |
|---|---|
| | Public API. |
| | CRS resolution via pyproj. Extracts projection type + parameters from EPSG codes. Maps pyproj method names to internal names via |
| | Transform pipeline. Chains axis swap, deg/rad, central meridian, projection core, scale, offset. Contains |
| | 40 CUDA kernel source strings (20 projections × fwd/inv). Compiled and cached via CuPy |
| | NumPy/CuPy element-wise implementations. Each is a |
| | Reference ellipsoid definitions (WGS84, GRS80, sphere). |
| | Helmert 7/15-parameter datum transformation. |
| | GPU/CPU detection and array module selection. |
| | Consumer vs datacenter GPU classification. |
| | Double-single fp32 arithmetic CUDA device functions. |
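To make the "double-single" idea in the last row concrete, here is a minimal CPU-side sketch in Python. It is not vibeProj's code (the real implementation is a set of CUDA device functions that are not shown in this document). It only illustrates the general technique of carrying a value as an unevaluated sum of two fp32 numbers. The helper names `f32`, `two_sum`, `to_ds`, and `ds_add` are hypothetical, and fp32 rounding is emulated with `struct`.

```python
import struct


def f32(x):
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Knuth two-sum on fp32 inputs: s is the rounded sum, e its rounding error."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e


def to_ds(x):
    """Split an fp64 value into a (hi, lo) pair of fp32 values."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo


def ds_add(a, b):
    """Add two double-single numbers and return a renormalized (hi, lo) pair."""
    s, e = two_sum(a[0], b[0])
    e = f32(e + f32(a[1] + b[1]))
    hi = f32(s + e)
    lo = f32(e - f32(hi - s))  # fast two-sum; assumes |s| >= |e|
    return hi, lo


easting = 500000.01                       # metres; too many digits for one fp32
plain_error = abs(f32(easting) - easting)  # error of a single fp32 value
hi, lo = to_ds(easting)
ds_error = abs((hi + lo) - easting)        # error of the (hi, lo) pair
print(plain_error, ds_error)
```

A single fp32 value can only resolve a coordinate of this size to a few centimetres, while the pair recovers it to well below a micrometre. That gap is the motivation for this kind of arithmetic on hardware where fp64 throughput is limited.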
Transform pipeline stages
A forward transform (geographic -> projected) executes these stages:
1. Axis swap – CRS-dependent. EPSG:4326 is (lat, lon); some projected CRS are (E, N).
2. Datum shift – Helmert 7/15-parameter (if cross-datum). Converts geodetic coords from source ellipsoid to destination ellipsoid via ECEF intermediate. When z (ellipsoidal height) is provided, it is included in the ECEF conversion and recovered on the destination ellipsoid. Skipped entirely when helmert is None (same-datum). Projection stages (3-8) are inherently 2D; z passes through unchanged.
3. Degree to radian – lat * pi/180, lon * pi/180.
4. Central meridian – lon -= lon_0, wrapped to [-pi, pi].
5. Projection core – the actual math (Transverse Mercator, Lambert, etc.).
6. Scale – multiply by semi-major axis a.
7. False easting/northing – add x_0, y_0.
8. Output axis swap – match destination CRS axis order.
Inverse transforms reverse these stages.
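The stage order can be sketched in plain Python for a single point. This is an illustration only, not vibeProj's implementation: it uses a spherical Mercator formula as a stand-in projection core, assumes a same-datum transform (so the datum shift is skipped), and the function names `forward` and `inverse` are hypothetical.

```python
import math

A_WGS84 = 6378137.0  # semi-major axis a, in metres


def forward(lat_deg, lon_deg, lon_0_deg=0.0, a=A_WGS84, x_0=0.0, y_0=0.0):
    """Geographic (lat, lon) in degrees -> projected (E, N) in metres."""
    # Stage 1 (axis swap) is implicit in the (lat, lon) argument order.
    # Stage 2 (datum shift) is skipped: source and destination share a datum.
    phi = math.radians(lat_deg)                        # stage 3: degree to radian
    lam = math.radians(lon_deg)
    lam = math.remainder(lam - math.radians(lon_0_deg), 2.0 * math.pi)  # stage 4
    x = lam                                            # stage 5: projection core
    y = math.log(math.tan(math.pi / 4.0 + phi / 2.0))
    x, y = a * x, a * y                                # stage 6: scale by a
    return x + x_0, y + y_0                            # stage 7; stage 8 is a no-op here


def inverse(x, y, lon_0_deg=0.0, a=A_WGS84, x_0=0.0, y_0=0.0):
    """Projected (E, N) -> geographic (lat, lon): the same stages in reverse."""
    x, y = (x - x_0) / a, (y - y_0) / a                # undo offset, then scale
    phi = 2.0 * math.atan(math.exp(y)) - math.pi / 2.0  # inverse projection core
    lam = math.remainder(x + math.radians(lon_0_deg), 2.0 * math.pi)
    return math.degrees(phi), math.degrees(lam)


e, n = forward(48.8566, 2.3522, lon_0_deg=3.0, x_0=500000.0)
lat, lon = inverse(e, n, lon_0_deg=3.0, x_0=500000.0)
print(e, n, lat, lon)
```

The inverse undoes each forward stage in the opposite order: remove the false easting/northing, divide by `a`, invert the core, then restore the central meridian and convert back to degrees.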
For proj-to-proj transforms (projected -> projected), the pipeline decomposes into: inverse(src) -> datum shift (if cross-datum) -> forward(dst). Each sub-step may use its own fused GPU kernel, with the Helmert shift running as a separate kernel launch between them.
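The datum shift that sits between the two projection steps can be sketched as geodetic -> ECEF -> Helmert -> geodetic. The code below is a CPU-only illustration under stated assumptions, not the library's kernel: it covers only the 7-parameter case, assumes the position-vector rotation convention with small angles in radians and a unitless scale, and uses a simple fixed-point iteration for the ECEF inverse. All function names are hypothetical.

```python
import math

WGS84 = (6378137.0, 1.0 / 298.257223563)  # (semi-major axis a, flattening f)


def geodetic_to_ecef(lat, lon, h, a, f):
    """Geodetic (radians, metres) -> ECEF X, Y, Z in metres."""
    e2 = f * (2.0 - f)
    n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)  # prime vertical radius
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return x, y, z


def ecef_to_geodetic(x, y, z, a, f, iterations=10):
    """ECEF -> geodetic by fixed-point iteration (not valid at the poles)."""
    e2 = f * (2.0 - f)
    p = math.hypot(x, y)
    lon = math.atan2(y, x)
    lat = math.atan2(z, p * (1.0 - e2))  # initial guess, exact for h = 0
    h = 0.0
    for _ in range(iterations):
        n = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - e2 * n / (n + h)))
    return lat, lon, h


def helmert7(x, y, z, tx, ty, tz, rx, ry, rz, s):
    """7-parameter Helmert, position-vector convention (an assumption here)."""
    m = 1.0 + s
    xo = tx + m * (x - rz * y + ry * z)
    yo = ty + m * (rz * x + y - rx * z)
    zo = tz + m * (-ry * x + rx * y + z)
    return xo, yo, zo


def datum_shift(lat, lon, h, src, dst, params):
    """Source ellipsoid -> ECEF -> Helmert -> destination ellipsoid."""
    x, y, z = geodetic_to_ecef(lat, lon, h, *src)
    x, y, z = helmert7(x, y, z, *params)
    return ecef_to_geodetic(x, y, z, *dst)


# A pure 100 m translation along X at (lat=0, lon=0) shows up as 100 m of height.
print(datum_shift(0.0, 0.0, 0.0, WGS84, WGS84, (100.0, 0, 0, 0, 0, 0, 0)))
```

Because the height `h` takes part in the ECEF conversion on both sides, it is carried through the shift and recovered on the destination ellipsoid, which matches the behaviour described for z in the stage list.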
Fused kernel fast-path
When _try_fused() detects a CuPy array input and a supported projection,
it dispatches to a single GPU kernel that performs the 7 projection-side
pipeline stages (every stage listed above except the datum shift, which runs
as its own kernel) in one kernel launch. This eliminates:
- ~20 intermediate CuPy kernel launches
- ~20 temporary array allocations
- Multiple global memory round-trips
The xp (NumPy/CuPy element-wise) path runs the same stages as individual operations. It serves as the CPU fallback and the reference for testing.
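The difference between the two paths can be illustrated without a GPU. In the sketch below, `staged` mimics the element-wise path (one pass and one temporary per stage) and `fused` mimics the single-kernel path (all stages applied per point in one pass). Both function names are hypothetical, the projection core is a trivial equirectangular placeholder, and plain Python lists stand in for arrays.

```python
import math


def staged(lats_deg, lons_deg, lon_0, a, x_0, y_0):
    """One pass and one temporary list per stage, like the element-wise path."""
    phi = [math.radians(v) for v in lats_deg]             # degree to radian
    lam = [math.radians(v) for v in lons_deg]
    lam = [math.remainder(v - lon_0, 2.0 * math.pi) for v in lam]  # central meridian
    ux, uy = list(lam), list(phi)                         # placeholder projection core
    sx = [a * v for v in ux]                              # scale
    sy = [a * v for v in uy]
    return [v + x_0 for v in sx], [v + y_0 for v in sy]   # false easting/northing


def fused(lats_deg, lons_deg, lon_0, a, x_0, y_0):
    """Every stage applied per point in a single loop, with no per-stage temporaries."""
    xs, ys = [], []
    for lat, lon in zip(lats_deg, lons_deg):
        lam = math.remainder(math.radians(lon) - lon_0, 2.0 * math.pi)
        xs.append(a * lam + x_0)
        ys.append(a * math.radians(lat) + y_0)
    return xs, ys


lats, lons = [0.0, 45.0, -30.0], [3.0, 10.0, -170.0]
args = (math.radians(3.0), 6378137.0, 500000.0, 0.0)  # lon_0, a, x_0, y_0
print(staged(lats, lons, *args))
print(fused(lats, lons, *args))
```

Both functions produce the same coordinates, which is why the staged path can serve as a reference when testing the fused one. On a GPU each list comprehension in `staged` would correspond to a separate kernel launch and allocation.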
Lazy CuPy imports
CuPy is always imported lazily to keep vibeProj usable without a GPU:
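    # In pipeline.py
    _cupy_module = None  # None = not tried yet, False = import failed

    def _get_cupy():
        """Return the cupy module, or None if CuPy is not installed."""
        global _cupy_module
        if _cupy_module is None:
            try:
                import cupy
                _cupy_module = cupy
            except ImportError:
                _cupy_module = False  # remember the failure; do not retry
        return _cupy_module if _cupy_module is not False else None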
This pattern is used throughout the codebase. Never add a top-level
import cupy to any module.