Configuration Reference
All pipeline parameters are defined in config.env at the repository root.
hls_pipeline.sh sources this file before dispatching each step; Python scripts
read values via os.environ.get() with per-parameter fallback defaults.
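The fallback pattern described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual helper: the function names `get_param` and `get_list_param` are hypothetical, the fallback value shown in the usage note is made up, and treating an empty string as unset is an assumption of this sketch.

```python
import os


def get_param(name, default, environ=None):
    """Return the value of `name`, or `default` if it is unset or blank.

    `environ` defaults to os.environ; passing a dict makes the helper
    easy to exercise without touching the real environment.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "")
    # Assumption of this sketch: a blank value counts as "not set".
    return value if value.strip() else default


def get_list_param(name, default, environ=None):
    """Read a space-separated parameter (e.g. PROCESSED_VIS) as a list."""
    return get_param(name, default, environ).split()
```

For example, `get_list_param("PROCESSED_VIS", "NDVI")` would return the configured indices as a list, falling back to `["NDVI"]` when the variable is not exported.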
Paths
| Parameter | Default | Description |
|---|---|---|
| | (required) | Root directory for all pipeline inputs and outputs |
| | | Directory for pipeline log files |
Output Directories
All paths are relative to BASE_DIR by default.
| Parameter | Default path | Description |
|---|---|---|
| | | Downloaded raw HLS band and Fmask GeoTIFFs (step 01 output) |
| | | Per-granule VI GeoTIFFs (step 02 output) |
| | | Per-tile NetCDF time-series files (step 03 output) |
| | | Reprojected temporal mean tiles (step 04 output) |
| | | Reprojected outlier mean + count tiles (step 05 output) |
| | | Study-area-wide mosaic GeoTIFFs (steps 06–09 output) |
| | | Multi-band time-window stacks (step 10 output) |
| | | Outlier point GeoPackage files (step 11 output) |
Processing Parameters
| Parameter | Default | Description |
|---|---|---|
| | | Parallel worker processes for compute-intensive steps (02, 04, 05, 09, 10, 11) |
| | | Tiles loaded per dask chunk during xarray processing (steps 04, 05, 09, 10) |
| | | Output CRS for all reprojected and mosaicked products (steps 04–11). Must be a projected CRS (linear units such as metres). A geographic CRS (degrees) is accepted but produces a |
Output Format
| Parameter | Default | Description |
|---|---|---|
| | | zlib compression level for NetCDF time-series files (step 03). Range 0–9: |
| | | Compression codec for all GeoTIFF outputs (steps 02, 04–10). Any codec supported by your GDAL build: |
| | | Internal tile block dimension (pixels) for all tiled GeoTIFF outputs (steps 04–10). Must be a power of two. |
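
The two numeric constraints in the table above (compression level in 0–9, block dimension a power of two) are easy to check before a long run. The sketch below is illustrative only; the function names are hypothetical and nothing here is taken from the pipeline's own validation code.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and other values."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to 0.
    return n > 0 and (n & (n - 1)) == 0


def check_output_format(compression_level, block_size):
    """Return a list of human-readable problems (empty if both values are valid)."""
    problems = []
    if not 0 <= compression_level <= 9:
        problems.append("zlib compression level must be in the range 0-9")
    if not is_power_of_two(block_size):
        problems.append("tile block dimension must be a power of two")
    return problems
```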
Vegetation Indices
PROCESSED_VIS="NDVI EVI2 NIRv"
Space-separated list of vegetation indices to process end-to-end. All listed VIs flow through every active pipeline step.
| Value | Formula | Typical range | Band requirements |
|---|---|---|---|
| | | −1.0 to 1.0 | B05/B8A, B04, Fmask |
| | | −1.0 to 2.0 | B05/B8A, B04, Fmask |
| | | −0.5 to 1.0 | B05/B8A, B04, Fmask |
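
As a rough illustration of what the three indices compute from the NIR and Red bands, the sketch below uses the commonly published definitions of NDVI, EVI2, and NIRv. Whether step 02 uses exactly these expressions is an assumption, and the function name `compute_vi` is hypothetical. Inputs are raw integer reflectance values, scaled by the HLS scale factor of 0.0001.

```python
HLS_SCALE_FACTOR = 0.0001  # HLS surface reflectance scale factor


def compute_vi(name, nir_dn, red_dn, scale=HLS_SCALE_FACTOR):
    """Compute one vegetation index for a single pixel.

    nir_dn / red_dn are raw integer band values (B05 or B8A, and B04).
    Returns None when the denominator is zero.
    """
    nir = nir_dn * scale
    red = red_dn * scale
    if name == "NDVI":
        denom = nir + red
        return (nir - red) / denom if denom != 0 else None
    if name == "EVI2":
        # Two-band EVI: 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1)
        denom = nir + 2.4 * red + 1.0
        return 2.5 * (nir - red) / denom if denom != 0 else None
    if name == "NIRv":
        # NIRv is NDVI multiplied by NIR reflectance.
        denom = nir + red
        return nir * (nir - red) / denom if denom != 0 else None
    raise ValueError(f"unknown vegetation index: {name}")
```

For a pixel with NIR = 5000 and Red = 1000 (reflectances 0.5 and 0.1), this gives an NDVI of about 0.667.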
Pipeline Step Control
STEPS="all"
Controls which pipeline stages run. Accepts named steps (space-separated, any order, any combination) or convenience aliases.
Named steps
| Value | Step | Script | Description |
|---|---|---|---|
| | 01 | | Query NASA CMR API; download raw HLS bands and Fmask |
| | 02 | | Compute VI GeoTIFFs from raw bands; apply Fmask masking |
| | 03 | | Aggregate per-granule VI GeoTIFFs into per-tile CF-1.8 NetCDF time-series |
| | 04 | | Temporal mean per tile; reproject to |
| | 05 | | Outlier-aware mean + valid count per tile; reproject |
| | 06 | | Mosaic per-tile means into a single study-area-wide GeoTIFF |
| | 07 | | Mosaic outlier-filtered mean tiles |
| | 08 | | Mosaic outlier pixel count tiles |
| | 09 | | Count valid observations per pixel across all download cycles; mosaic result |
| | 10 | | Multi-band time-window stacks defined by |
| | 11 | | Export per-pixel outlier observations to a GeoPackage point vector file |
Convenience aliases
| Alias | Expands to | Use case |
|---|---|---|
| | Steps 01–11 | Full pipeline from scratch |
| | Steps 02–11 | Raw data already downloaded |
| | Steps 01–03 | Download through NetCDF only |
| | Steps 06–08 | Re-mosaic only (tiles already reprojected) |
| | Steps 05+07+08+11 | Re-run the full outlier chain |
Examples
STEPS="all" # Full pipeline from scratch
STEPS="products" # Raw data exists, build everything
STEPS="build_nc" # Download through NetCDF only
STEPS="timeseries" # Re-run only the time-series step
STEPS="mosaics" # Re-mosaic after fixing a tile
STEPS="outliers" # Re-run full outlier chain
STEPS="outlier_gpkg" # Export outlier points only
STEPS="count_valid_mosaic" # CountValid mosaic only
STEPS="netcdf mean_flat mean_mosaic" # NetCDF through mean mosaic only
STEPS="mean_flat outlier_flat mosaics timeseries" # Add a new VI (NetCDFs exist)
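
To make the mixing of aliases and named steps concrete, here is a small sketch of how a STEPS value could be expanded into step numbers. It is not the logic in hls_pipeline.sh. The alias-to-expansion pairing is inferred by matching the example comments above to the "Use case" descriptions, and the name-to-number mapping is an assumption for every name except netcdf (step 03) and timeseries (step 10). Only names that appear in the examples are included.

```python
# Assumed pairing of alias names (from the examples) with the listed expansions.
ALIASES = {
    "all": range(1, 12),         # steps 01-11
    "products": range(2, 12),    # steps 02-11
    "build_nc": range(1, 4),     # steps 01-03
    "mosaics": range(6, 9),      # steps 06-08
    "outliers": (5, 7, 8, 11),   # steps 05+07+08+11
}

# Assumed step numbers for the named steps that appear in the examples.
NAMED_STEPS = {
    "netcdf": 3,
    "mean_flat": 4,
    "outlier_flat": 5,
    "mean_mosaic": 6,
    "count_valid_mosaic": 9,
    "timeseries": 10,
    "outlier_gpkg": 11,
}


def expand_steps(value):
    """Expand a space-separated STEPS string into a sorted list of step numbers."""
    selected = set()
    for token in value.split():
        if token in ALIASES:
            selected.update(ALIASES[token])
        elif token in NAMED_STEPS:
            selected.add(NAMED_STEPS[token])
        else:
            raise ValueError(f"unknown step or alias: {token}")
    return sorted(selected)
```

Because the result is a set that is sorted at the end, token order in STEPS does not matter and overlapping tokens are harmless.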
Space Saver Options
These options take effect, tile by tile, only when step 03 (netcdf) is active in
the current run. Both flags are safe to enable together.
| Parameter | Values | Default | Description |
|---|---|---|---|
| | | | Delete downloaded HLS band + Fmask files from |
| | | | Delete per-granule VI GeoTIFFs from |
Download Approval
Before any data is downloaded, the pipeline prints a storage estimate and prompts for confirmation. To bypass this prompt in automated or non-interactive contexts:
| Parameter | Values | Default | Description |
|---|---|---|---|
| | | | Bypass the interactive download approval prompt. Set |
Download Settings
| Parameter | Default | Description |
|---|---|---|
| | | Maximum cloud coverage percentage for CMR API granule filtering (0–100) |
| | | Minimum spatial coverage percentage for CMR API granule filtering (0–100) |
Band Selection
Defines which bands to download for each HLS sensor. Fmask is always required
for quality masking.
| Parameter | Default | Description |
|---|---|---|
| | | Landsat bands: NIR (B05), Red (B04), quality mask |
| | | Sentinel-2 bands: NIR narrow (B8A), Red (B04), quality mask |
Full band reference:
| Sensor | Band | Wavelength | Role |
|---|---|---|---|
| L30 | B04 | Red | Required by NDVI, EVI2, NIRv |
| L30 | B05 | NIR | Required by NDVI, EVI2, NIRv |
| L30 | B02 | Blue | Only needed for 3-band EVI (not currently used) |
| S30 | B04 | Red | Required by NDVI, EVI2, NIRv |
| S30 | B8A | NIR narrow | Required by NDVI, EVI2, NIRv |
| S30 | B02 | Blue | Only needed for 3-band EVI (not currently used) |
The pipeline validates at startup that all bands required for the selected
PROCESSED_VIS are present in these lists before any step executes.
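
A startup check of this kind might look like the sketch below. It is a hedged illustration rather than the pipeline's validator: the function name `missing_bands` is hypothetical, and the band lists are assumed to be space-separated strings like the other list parameters. The required bands follow the band reference table above (NIR and Red for every supported VI, plus Fmask).

```python
# NIR band per sensor, from the band reference table.
NIR_BAND = {"L30": "B05", "S30": "B8A"}
SUPPORTED_VIS = {"NDVI", "EVI2", "NIRv"}


def missing_bands(processed_vis, band_lists):
    """Return {sensor: [missing band, ...]} for the selected VIs.

    processed_vis: list of VI names, e.g. ["NDVI", "EVI2"].
    band_lists: {"L30": "B05 B04 Fmask", "S30": "B8A B04 Fmask"}
                (space-separated strings; an assumed format).
    """
    unknown = set(processed_vis) - SUPPORTED_VIS
    if unknown:
        raise ValueError(f"unsupported vegetation indices: {sorted(unknown)}")
    missing = {}
    for sensor, bands in band_lists.items():
        # Every supported VI needs NIR + Red; Fmask is always required.
        required = {"Fmask"}
        if processed_vis:
            required |= {NIR_BAND[sensor], "B04"}
        gap = sorted(required - set(bands.split()))
        if gap:
            missing[sensor] = gap
    return missing
```

An empty result means every required band is configured; a non-empty result names the gaps per sensor so the run can stop before any step executes.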
Fmask Quality Masking
Individual bit flags for the HLS Fmask quality band. Set TRUE to mask
(exclude) pixels with the corresponding condition.
| Parameter | Fmask bit | Default | Description |
|---|---|---|---|
| | Bit 0 | | Mask cirrus cloud pixels |
| | Bit 1 | | Mask cloud pixels |
| | Bit 2 | | Mask pixels adjacent to cloud |
| | Bit 3 | | Mask cloud shadow pixels |
| | Bit 4 | | Mask snow and ice pixels |
| | Bit 5 | | Mask open water pixels |
| | Bits 6–7 | | Aerosol masking threshold (see below) |
Aerosol modes:
| Mode | Behavior |
|---|---|
| | Mask only high-aerosol pixels (general use) |
| | Mask high + moderate aerosol (recommended for VIs) |
| | Mask all non-zero aerosol pixels |
| | No aerosol masking |
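
The bit layout above can be decoded with plain bitwise operations. The sketch below is illustrative: the flag names in `FMASK_BITS` and the `min_aerosol_level` argument are hypothetical stand-ins for the configuration parameters, and the aerosol encoding (bits 6–7 read as a level from 0 to 3, with 2 = moderate and 3 = high) is an assumption based on the HLS Fmask layout.

```python
# Hypothetical flag names mapped to the bit positions in the table above.
FMASK_BITS = {
    "cirrus": 0,
    "cloud": 1,
    "adjacent": 2,
    "shadow": 3,
    "snow_ice": 4,
    "water": 5,
}


def aerosol_level(fmask):
    """Extract the 2-bit aerosol level stored in bits 6-7 (0..3)."""
    return (fmask >> 6) & 0b11


def is_masked(fmask, enabled, min_aerosol_level=None):
    """Decide whether one Fmask value should be excluded.

    enabled: iterable of flag names whose bits should trigger masking.
    min_aerosol_level: None for no aerosol masking; 3 masks high only,
    2 masks high + moderate, 1 masks all non-zero levels (assumed encoding).
    """
    for name in enabled:
        if fmask & (1 << FMASK_BITS[name]):
            return True
    if min_aerosol_level is not None:
        return aerosol_level(fmask) >= min_aerosol_level
    return False
```

For instance, `is_masked(0b00000010, ["cloud", "shadow"])` is true because bit 1 (cloud) is set.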
Note
HLS_SCALE_FACTOR=0.0001 is the HLS surface reflectance scale factor applied
during VI calculation (step 02). This value reflects the NASA HLS v2.0 data
specification and should not be changed.
Valid Range Bounds
Pixels outside these bounds are treated as outliers in steps 05, 07, 08, 09,
10, and 11. Format: "min,max" (no spaces).
| Parameter | Default | Scientific basis |
|---|---|---|
| | | Bounded by definition: ratio of two bands of equal magnitude at the extremes |
| | | Captures all physically plausible values while rejecting noise; EVI2 can exceed 1.0 over bright or noisy surfaces |
| | | Rejects implausible negative values while preserving all legitimate high-vegetation values (dense tropical canopy ~0.5–0.6) |
Adjust these thresholds if your study region has atypical surface conditions (e.g., snow/ice, salt flats, open water).
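
Parsing the "min,max" format and applying it to a pixel value could be sketched as below. The function names are hypothetical, the bounds string in the usage note is illustrative rather than a documented default, and treating the bounds themselves as valid (inclusive) is an assumption.

```python
def parse_range(text):
    """Parse a "min,max" string (no spaces) into a (min, max) float tuple."""
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'min,max', got: {text!r}")
    low, high = float(parts[0]), float(parts[1])
    if low >= high:
        raise ValueError(f"min must be less than max: {text!r}")
    return low, high


def is_outlier(value, bounds):
    """True when value lies outside the inclusive [min, max] bounds."""
    low, high = bounds
    return not (low <= value <= high)
```

With bounds parsed from `"-1.0,1.0"`, a value of 1.2 is flagged as an outlier while 1.0 is kept.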
Tile List
HLS_TILES="17TNE 17TNF 17TPE"
Space-separated list of MGRS tile IDs to process. Enforced uniformly across all 11 pipeline steps — step 01 uses it for CMR API queries; steps 02–11 filter all file globs against it immediately after each glob call.
If HLS_TILES is unset or empty, no tile filtering is applied and all
discovered files are processed.
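
The filter-after-glob behaviour can be illustrated with a short sketch. It assumes that each file name carries the tile ID as a dot-delimited `T<tile>` token, as in standard HLS granule names (for example `HLS.L30.T17TNE.2020001T160000.v2.0.B04.tif`); the pipeline's own intermediate file names may differ, and `filter_by_tiles` is a hypothetical name.

```python
import os


def filter_by_tiles(paths, hls_tiles):
    """Keep only paths whose file name mentions one of the configured tiles.

    hls_tiles is the raw HLS_TILES string; unset or empty means no filtering.
    """
    tiles = (hls_tiles or "").split()
    if not tiles:
        return list(paths)
    # Assumed naming convention: ".T17TNE." appears in the base name.
    tokens = tuple(f".T{tile}." for tile in tiles)
    return [
        path
        for path in paths
        if any(token in os.path.basename(path) for token in tokens)
    ]
```

In practice the `paths` argument would be the result of a `glob.glob(...)` call, filtered immediately afterwards.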
Download Cycles
DOWNLOAD_CYCLES="2020-01-01|2020-12-31 2021-01-01|2021-12-31"
Space-separated list of date ranges in YYYY-MM-DD|YYYY-MM-DD format. Step 01
queries and downloads each range as a separate cycle. Multiple cycles allow
non-contiguous time periods (e.g., winter-only seasons across multiple years).
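
A sketch of how the cycle list could be parsed, plus a helper that tests whether a given acquisition date falls inside any cycle. Both function names are hypothetical, and rejecting a cycle whose start is after its end is an assumption of this sketch, not documented pipeline behaviour.

```python
from datetime import date


def parse_cycles(value):
    """Parse DOWNLOAD_CYCLES into a list of (start, end) date tuples."""
    cycles = []
    for token in value.split():
        parts = token.split("|")
        if len(parts) != 2:
            raise ValueError(f"expected YYYY-MM-DD|YYYY-MM-DD, got: {token!r}")
        start = date.fromisoformat(parts[0])
        end = date.fromisoformat(parts[1])
        if start > end:
            raise ValueError(f"cycle starts after it ends: {token!r}")
        cycles.append((start, end))
    return cycles


def in_any_cycle(day, cycles):
    """True when `day` falls within at least one (inclusive) cycle."""
    return any(start <= day <= end for start, end in cycles)
```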
Time-Series Windows
Controls step 10 (timeseries), which produces multi-band composite stacks
where each band represents one named time window.
| Parameter | Values | Default | Description |
|---|---|---|---|
| | | | Must be |
| | | | Statistic computed per pixel per window |
TIMESLICE_WINDOWS="label:YYYY-MM-DD|YYYY-MM-DD ..."
Space-separated list of named date windows. Each token is label:start|end where:
label: alphanumeric characters and underscores only; becomes the band description in the output stack
start / end: inclusive date bounds (YYYY-MM-DD); start must be ≤ end
Examples:
# Wet / dry seasons
TIMESLICE_WINDOWS="wet_2020:2020-11-01|2021-04-30 dry_2021:2021-05-01|2021-10-31"
# Monthly slices (outlier forensics)
TIMESLICE_WINDOWS="jan_2021:2021-01-01|2021-01-31 feb_2021:2021-02-01|2021-02-28"
# Annual composites
TIMESLICE_WINDOWS="yr_2016:2016-01-01|2016-12-31 yr_2017:2017-01-01|2017-12-31"
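
The token rules above (label characters, date format, start ≤ end) can be checked with a small parser. This is a sketch under those stated rules only; the function name `parse_windows` is hypothetical and the pipeline's own parser may report errors differently.

```python
import re
from datetime import date

# Labels: alphanumeric characters and underscores only.
LABEL_RE = re.compile(r"^[A-Za-z0-9_]+$")


def parse_windows(value):
    """Parse TIMESLICE_WINDOWS into a list of (label, start, end) tuples."""
    windows = []
    for token in value.split():
        label, sep, span = token.partition(":")
        if not sep or not LABEL_RE.match(label):
            raise ValueError(f"invalid window label in token: {token!r}")
        start_text, bar, end_text = span.partition("|")
        if not bar:
            raise ValueError(f"missing '|' between dates in token: {token!r}")
        # date.fromisoformat raises ValueError on malformed dates.
        start = date.fromisoformat(start_text)
        end = date.fromisoformat(end_text)
        if start > end:
            raise ValueError(f"start date is after end date in token: {token!r}")
        windows.append((label, start, end))
    return windows
```

Each returned tuple corresponds to one band of the output stack, in the order the windows are listed.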