Extracting Flight Data from Email

Guide for identifying actual flight bookings and extracting structured information from Gmail.

For Gmail search query patterns, see the Gmail Search Patterns slice (Travel > Flights section).

Identifying Flight Confirmations

Not every email with "flight" in the subject is a flight confirmation. Use judgment to filter out the ones that are not actual bookings.

Actual flight confirmations contain:

  • Flight numbers (e.g., "UA 123", "BA456")
  • PNR/confirmation codes (typically 6 alphanumeric characters)
  • Specific departure/arrival times and airports
  • Airline name or logo
  • Passenger name matching the user

Exclude:

  • Hotel-only bookings (no flight info)
  • Car rental confirmations
  • Flight deal newsletters and promotions
  • Price alerts and fare tracking emails
  • Generic "thank you for booking" without flight details
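As a rough illustration, the signals above can be approximated with a few regular expressions. This is a minimal sketch, not a definitive filter: the patterns, the label wording they match, and the function name `looks_like_flight_confirmation` are all assumptions made for this example, and signals such as the airline name or the passenger name are not checked here.

```python
import re

# Heuristic patterns; these are assumptions for illustration, not airline-verified rules.
# Two-letter carrier code followed by 1-4 digits, e.g. "UA 123" or "BA456".
FLIGHT_NUMBER_RE = re.compile(r"\b[A-Z]{2}\s?\d{1,4}\b")
# A 6-character code that follows a label such as "Confirmation code:" or "PNR".
CONFIRMATION_RE = re.compile(
    r"(?i:confirmation(?:\s+(?:code|number))?|booking\s+reference|record\s+locator|pnr)"
    r"\s*[:#]?\s*([A-Z0-9]{6})\b"
)
# A clock time such as "10:30" or "14:45".
TIME_RE = re.compile(r"\b\d{1,2}:\d{2}\b")


def looks_like_flight_confirmation(text: str) -> bool:
    """Return True when the text carries the core signals of a real booking."""
    has_flight_number = FLIGHT_NUMBER_RE.search(text) is not None
    has_confirmation = CONFIRMATION_RE.search(text) is not None
    has_time = TIME_RE.search(text) is not None
    return has_flight_number and has_confirmation and has_time
```

Requiring all three signals is deliberately strict: promotions and price alerts rarely carry a labeled confirmation code, so they fall out without a separate exclusion list. Borderline emails still need a judgment call.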

Extracting Flight Details

For each confirmed flight email, extract:

Field | What to Look For | Example
Departure City | Airport code (3 letters) or city name near "from", "depart" | SFO, San Francisco
Arrival City | Airport code or city name near "to", "arrive" | CDG, Paris
Departure Date | Date near departure info, format varies | Jan 15, 2024 or 2024-01-15
Departure Time | Time in 12h or 24h format | 10:30 AM, 14:45
Arrival Date | Often same as departure, check for overnight | Jan 16, 2024
Arrival Time | Time at destination (local time) | 07:00
Airline | Carrier name or 2-letter code | United, UA
Flight Number | Alphanumeric, often after airline | UA 837
Confirmation Code | 6-character PNR | ABC123
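Because date and time formats vary between airlines, it can help to normalize them to the ISO date and 24-hour time used in the output. The sketch below tries a short list of formats in order. The format lists and the helper names `normalize_date` and `normalize_time` are assumptions for this example, and month names and AM/PM markers are parsed according to the current locale (English is assumed).

```python
from datetime import datetime
from typing import Optional

# Candidate formats are assumptions; extend them as new email layouts appear.
DATE_FORMATS = ("%Y-%m-%d", "%b %d, %Y", "%B %d, %Y", "%d %b %Y")
TIME_FORMATS = ("%H:%M", "%I:%M %p")


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_time(raw: str) -> Optional[str]:
    """Return the time as 24-hour HH:MM, or None if no known format matches."""
    for fmt in TIME_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).strftime("%H:%M")
        except ValueError:
            continue
    return None
```

Returning None rather than raising keeps a partially parsed flight usable, which matches the rule of recording whatever data is available.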

Airport to Country Mapping

Common airport codes and their countries:

North America: JFK/EWR/LGA (USA-NY), LAX/SFO (USA-CA), ORD (USA-IL), YYZ (Canada), MEX (Mexico)

Europe: LHR/LGW (UK), CDG/ORY (France), FRA/MUC (Germany), AMS (Netherlands), FCO (Italy), MAD/BCN (Spain), ZRH (Switzerland)

Asia: NRT/HND (Japan), ICN (South Korea), PEK/PVG (China), SIN (Singapore), BKK (Thailand), HKG (Hong Kong)

Middle East: DXB (UAE), DOH (Qatar), TLV (Israel)

Oceania: SYD/MEL (Australia), AKL (New Zealand)

For unlisted airports, use the city name to determine country.
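The mapping above translates directly into a lookup table. The following is a sketch only: the dictionary covers a subset of the listed airports, and the `city_to_country` fallback is a hypothetical caller-supplied mapping standing in for whatever city lookup is actually available.

```python
from typing import Dict, Optional

# A subset of the airports listed above; US regions are collapsed to "USA".
AIRPORT_COUNTRY: Dict[str, str] = {
    "JFK": "USA", "EWR": "USA", "LGA": "USA", "LAX": "USA", "SFO": "USA", "ORD": "USA",
    "YYZ": "Canada", "MEX": "Mexico",
    "LHR": "UK", "LGW": "UK", "CDG": "France", "ORY": "France",
    "FRA": "Germany", "MUC": "Germany", "AMS": "Netherlands",
    "NRT": "Japan", "HND": "Japan", "SIN": "Singapore", "HKG": "Hong Kong",
    "DXB": "UAE", "DOH": "Qatar", "SYD": "Australia", "AKL": "New Zealand",
}


def lookup_country(
    airport_code: str,
    city: Optional[str] = None,
    city_to_country: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Resolve a country from the airport code, falling back to the city name."""
    country = AIRPORT_COUNTRY.get(airport_code.strip().upper())
    if country is not None:
        return country
    # Hypothetical fallback: the caller provides its own city-to-country mapping.
    if city is not None and city_to_country is not None:
        return city_to_country.get(city)
    return None
```

For example, `lookup_country("LIS", city="Lisbon", city_to_country={"Lisbon": "Portugal"})` resolves through the city fallback because LIS is not in the table.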

Handling Ambiguity

Multiple segments: A single confirmation may have multiple flights (outbound + return, or connections). Extract each segment separately.

Missing data: If the departure time is missing but the date is known, record the date and leave the time null. If an airport code is ambiguous, prefer the full city name.

Duplicates: The same flight may appear multiple times (confirmation + reminder + check-in). Deduplicate by: same date + same route + same flight number.
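The deduplication rule can be sketched as a key built from date, route, and flight number. Field names follow the output format in this guide. Normalizing the flight number's spacing and filling missing fields from later duplicates are assumptions added for this example, not part of the rule itself.

```python
from typing import Dict, List, Optional, Tuple

Flight = Dict[str, Optional[str]]


def dedupe_key(flight: Flight) -> Tuple[Optional[str], Optional[str], Optional[str], str]:
    """Same date + same route + same flight number identifies one flight."""
    # Treat "UA 837" and "ua837" as the same flight number.
    number = (flight.get("flightNumber") or "").replace(" ", "").upper()
    return (
        flight.get("departureDate"),
        flight.get("departureAirport"),
        flight.get("arrivalAirport"),
        number,
    )


def deduplicate(flights: List[Flight]) -> List[Flight]:
    """Keep the first record per key, filling its null fields from later copies."""
    merged: Dict[tuple, Flight] = {}
    for flight in flights:
        key = dedupe_key(flight)
        if key not in merged:
            merged[key] = dict(flight)
            continue
        for field, value in flight.items():
            if merged[key].get(field) is None and value is not None:
                merged[key][field] = value
    return list(merged.values())
```

Merging rather than discarding later copies means a check-in reminder can supply a departure time that the original confirmation lacked.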

Output Format

Write the extracted flights as a JSON array:

[
  {
    "source": "email",
    "confirmationCode": "ABC123",
    "airline": "United",
    "flightNumber": "UA 837",
    "departureCity": "San Francisco",
    "departureAirport": "SFO",
    "departureCountry": "USA",
    "departureDate": "2024-01-15",
    "departureTime": "10:30",
    "arrivalCity": "Paris",
    "arrivalAirport": "CDG",
    "arrivalCountry": "France",
    "arrivalDate": "2024-01-16",
    "arrivalTime": "07:00"
  }
]

Sort by departure date ascending. Include every field in each record, using null for any value that could not be extracted.
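A minimal sketch of the final step, assuming dates have already been normalized to YYYY-MM-DD so that string order matches chronological order. The function name `to_output`, the tie-break on departure time, and placing records with no departure date last are choices made for this example.

```python
import json
from typing import Dict, List, Optional

# Field order follows the sample record in this guide.
FIELDS = [
    "source", "confirmationCode", "airline", "flightNumber",
    "departureCity", "departureAirport", "departureCountry",
    "departureDate", "departureTime",
    "arrivalCity", "arrivalAirport", "arrivalCountry",
    "arrivalDate", "arrivalTime",
]


def to_output(flights: List[Dict[str, Optional[str]]]) -> str:
    """Emit every field (null when missing), sorted by departure date ascending."""
    rows = [{field: flight.get(field) for field in FIELDS} for flight in flights]
    rows.sort(
        key=lambda row: (
            row["departureDate"] is None,   # records without a date sort last
            row["departureDate"] or "",
            row["departureTime"] or "",     # tie-break within the same day
        )
    )
    return json.dumps(rows, indent=2)
```

Python's `None` serializes to JSON `null`, so missing values appear explicitly in the output rather than being dropped.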