Skip to content

Accessing COG Data

Examples on how to use python libraries to access and use COGs. Demonstrating one of the cloud optimized advantages, being able to access a portion of the data without having to download the entire file.

See https://guide.cloudnativegeo.org/

Using rasterio

pip install rasterio

The developers of the rasterio library provide additional examples of usage. For more details on the rasterio library, see the rasterio documentation.

Note

The following examples starts with a request to the CCMEO STAC API via the pystac-client library.

For more details on how to discover data through the STAC API, see the Interacting with CCMEO STAC API section

Read the header of a remote COG

In this code you will :

  • Query a STAC API with pystac-client to get link to a COG;
  • Read header metadata from a remote COG file

Info

This specific example uses the collection mrdem-30 from CCMEO's datacube

import pystac_client
import rasterio 

# Define a bounding box for an AOI (Ottawa) in EPSG:4326
bbox = [-75.8860,45.3157,-75.5261,45.5142]
bbox_crs = "EPSG:4326"

# Link to ccmeo datacube stac-api
stac_root = "https://datacube.services.geo.ca/stac/api"
catalog = pystac_client.Client.open(stac_root)

search = catalog.search(
    collections=['mrdem-30'], 
    bbox=bbox,
    ) 

# Get the link to the data asset for mrdem-30 dtm
links = []
for page in search.pages():
    for item in page:
        links.append(item.assets['dtm'].href)

# Read the header of a remote COG 
with rasterio.open(links[0]) as src:
    # Read the header of a COG
    print(src.tags())
    # >> {'AREA_OR_POINT': 'Area', 
    #     'TIFFTAG_DATETIME': '2024:06:13 12:00:00'}
    print(src.profile)
    # >> {'driver': 'GTiff', 'dtype': 'float32', 'nodata': -32767.0, 
    #   'width': 183687, 'height': 159655, 'count': 1, 
    #   'crs': CRS.from_epsg(3979), 'transform': Affine(30.0, 0.0, 
    #   -2454000.0, 0.0, -30.0, 3887400.0), 'blockxsize': 512, 
    #   'blockysize': 512, 'tiled': True, 'compress': 'lzw', 
    #   'interleave': 'band'}

Read a portion of a remote COG

In this code you will :

  • Query a STAC API with pystac-client to get link to a COG;
  • Read a portion of a remote COG based on an AOI with the window functionality;
  • Optionally write the portion locally inside a .tif file

Info

This specific example uses the collection mrdem-30 from CCMEO's datacube

Tip

COG'S file contains internal tiling that can be leverage by iterating on the src.block_windows() while reading. If the reading window does not align with the internal tiling of the file, the data will be resampled.

Example : https://rasterio.readthedocs.io/en/stable/topics/windowed-rw.html#blocks

API definition: https://rasterio.readthedocs.io/en/stable/topics/windowed-rw.html#blocks

import pystac_client
import rasterio 
from rasterio.windows import from_bounds
from rasterio.warp import transform_geom
from shapely.geometry import box, shape


# Define a bounding box for an AOI (Ottawa) in EPSG:4326
bbox=[-75.8860,45.3157,-75.5261,45.5142]
bbox_crs = "EPSG:4326"

# Link to ccmeo datacube stac-api
stac_root = "https://datacube.services.geo.ca/stac/api"
catalog = pystac_client.Client.open(stac_root)

search = catalog.search(
    collections=['mrdem-30'], 
    bbox=bbox,
    ) 

# Get the link to the data asset for mrdem-30 dtm
links = []
for page in search.pages():
    for item in page:
        links.append(item.assets['dtm'].href)

# Read AOI from the first COG
with rasterio.open(links[0]) as src:
    # Transform bbox to src EPSG
    transformed_bbox = shape(transform_geom(src.crs, bbox_crs, 
                                                box(*bbox))).bounds
    # Define the window to read the values
    window=from_bounds(transformed_bbox[0], transformed_bbox[1], 
                       transformed_bbox[2], transformed_bbox[3], 
                       src.transform)
    # Read value from file
    rst = src.read(1, window=window)

    # Copy and update the source metadata to write the output tiff
    metadata = src.meta.copy()
    metadata.update({
        'height': window.height,
        'width': window.width,
        'transform': rasterio.windows.transform(window, src.transform)
    }) 

# Perform analysis ...

# Write the output array to a tiff file
output_tiff = r"path/to/output.tif"
with rasterio.open(output_tiff, 'w', **metadata) as dst:
    dst.write(rst)