How to load air-routes data from https://github.com/krlawrence/graph/tree/master/sample-data

1. Download air-routes-latest-edges.csv and air-routes-latest-nodes.csv

2. Edit air-routes-latest-nodes.csv

##### original
~id,~label,type:string,code:string,icao:string,desc:string,region:string,runways:int,longest:int,elev:int,country:string,city:string,lat:double,lon:double,author:string,date:string
0,version,version,0.89,,Air Routes Data - Version: 0.89 Generated: 2022-08-29 14:10:18 UTC; Graph created by Kelvin R. Lawrence; Please let me know of any errors you find in the graph or routes that should be added.,,,,,,,,,Kelvin R. Lawrence,2022-08-29 14:10:18 UTC

##### edited
id,label,type,code,icao,desc,region,runways,longest,elev,country,city,lat,lon,author,date
1,airport,airport,ATL,KATL,Hartsfield - Jackson Atlanta International Airport,US-GA,5,12390,1026,US,Atlanta,33.6366996765137,-84.4281005859375,,

# (1) remove the second line
# (2) edit the first line (CSV Header)

3. Edit air-routes-latest-edges.csv

##### original
~id,~from,~to,~label,dist:int
3749,1,3,route,809

##### edited
id,start_id,end_id,label,dist,start_vertex_type,end_vertex_type
3749,1,3,route,809,airport,airport

# (1) edit the first line (CSV Header)
# (2) add start_vertex_type and end_vertex_type columns to each lines

4. Install agefreighter with pip on your Python environment
* with python venv
mkdir your_project
cd your_project
python3 -m venv .venv
source .venv/bin/activate
pip install agefreighter

* with uv
uv init your_project
cd your_project
uv venv
source .venv/bin/activate
uv add agefreighter

5. Make a Python script as below and locate the script in the same directory with the CSV files

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import logging
import os
from agefreighter import Factory

log = logging.getLogger(__name__)
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)


async def main():
    log.info("Loading...")
    loader = Factory.create_instance("MultiCSVFreighter")
    await loader.connect(
        dsn=os.environ["PG_CONNECTION_STRING"],
        max_connections=64,
    )
    await loader.load(
        vertex_csv_paths=["air-routes-latest-nodes.csv"],
        vertex_labels=["airport"],
        edge_csv_paths=["air-routes-latest-edges.csv"],
        edge_types=["route"],
        graph_name="air_route",
        use_copy=True,
        drop_graph=True,
    )


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())

6. Deploy Azure Database for PostgreSQL and enable Apache AGE extension on Azure Portal
https://techcommunity.microsoft.com/blog/adforpostgresql/introducing-support-for-graph-data-in-azure-database-for-postgresql-preview/4275628

7. Set the PostgreSQL connection string as an environment variable
export PG_CONNECTION_STRING="host=xxxxxx.postgres.database.azure.com port=5432 dbname=postgres user=......"

8. Run the script
python3 <script_name>.py

9. Check the graph
% psql $PG_CONNECTION_STRING
psql (16.6 (Homebrew), server 16.4)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=> SET search_path = ag_catalog, "$user", public;
SET
postgres=> select * from air_route.airport limit 1;
       id        |                                                                                                                                                                       properties
-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 844424930131969 | {"id": "1", "lat": "33.6366996765137", "lon": "-84.4281005859375", "city": "Atlanta", "code": "ATL", "date": "nan", "desc": "Hartsfield - Jackson Atlanta International Airport", "elev": "1026.0", "icao": "KATL", "type": "airport", "label": "airport", "author": "nan", "region": "US-GA", "country": "US", "longest": "12390.0", "runways": "5.0"}
(1 row)

postgres=> select * from air_route.route limit 1;
        id        |    start_id     |     end_id      |                    properties
------------------+-----------------+-----------------+---------------------------------------------------
 1125899906842625 | 844424930131969 | 844424930131971 | {"id": "3749", "dist": "809.0", "label": "route"}
(1 row)
