Extract Flights from Email
Progressive filtering extraction - list candidates, identify likely flights, fetch details, extract structured data
Review the email candidates (subjects and snippets only).
Identify which emails are LIKELY to be actual flight confirmations based on:
- Subject mentions specific airlines, flights, or booking references
- Snippet contains flight numbers, airport codes, or travel dates
- Sender appears to be an airline or booking service
Exclude emails that are clearly:
- Newsletters or promotional emails
- Price alerts or fare tracking
- Hotel-only or car rental confirmations
Write an array of email IDs for the likely matches to Likely Flight Email IDs
Format: { "ids": ["id1", "id2", ...] }
Be inclusive - it's better to fetch a few false positives than miss real flights.
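As an illustration of the output shape above, here is a minimal Python sketch that writes the likely IDs in the `{ "ids": [...] }` format. The helper name `write_likely_ids` is hypothetical and not part of this task's scripts. Dropping duplicate IDs is an added assumption. The default path is the one used in the steps below.

```python
import json
from pathlib import Path


def write_likely_ids(ids, path="session/flight-likely-ids.json"):
    """Write email IDs as {"ids": [...]}, dropping duplicates but keeping order.

    Hypothetical helper for illustration only.
    """
    unique_ids = list(dict.fromkeys(ids))  # dict keys preserve insertion order
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)  # create session/ if missing
    out.write_text(json.dumps({"ids": unique_ids}, indent=2), encoding="utf-8")
    return unique_ids
```

Borderline candidates should still be passed in, since false positives are filtered out later once the full content is available.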
Analyze the full email content to extract flight data.
For each confirmed flight, extract:
- Departure city/airport and country
- Arrival city/airport and country
- Departure date and time
- Arrival date/time (if available)
- Airline and flight number (if available)
- Confirmation/PNR code (if available)
Filter out any remaining false positives now that you have full content.
Write extracted flights to Extracted Flight Records as a JSON array
sorted by departure date. Use the format specified in the extraction guide.
To run this task you must have the following required information:
> Optional: number of years to search back (default 3), specific date range
If you don't have all of this information, stop here and respond with a request for whatever is missing, plus instructions to run this task again with ALL required information.
---
You MUST use a todo list to complete these steps in order. Never move on to a step until you have completed the previous one. If you have multiple CONSECUTIVE read steps in a row, read them all at once (in parallel). Otherwise, do not read a file until you reach that step.
Add all steps to your todo list now and begin executing.
## Steps
1. [Gather Arguments: List Flight Email Candidates] The next step requires the following arguments. Do not proceed until you have all the required information:
- `yearsBack` (default: "3"): 3
- `outputPath` (default: "session/flight-candidates.json"): session/flight-candidates.json
2. [Run Code: List Flight Email Candidates]: Call `run_script` with:
```json
{
  "file": {
    "path": "https://sk.ills.app/code/gmail.flights.list/preview",
    "args": [
      "yearsBack",
      "outputPath"
    ]
  },
  "packages": null
}
```
3. [Read Flight Email Candidates]: Read the file at `session/flight-candidates.json` into context (Lightweight list - just subjects and snippets)
4. Review the email candidates (subjects and snippets only).
Identify which emails are LIKELY to be actual flight confirmations based on:
- Subject mentions specific airlines, flights, or booking references
- Snippet contains flight numbers, airport codes, or travel dates
- Sender appears to be an airline or booking service
Exclude emails that are clearly:
- Newsletters or promotional emails
- Price alerts or fare tracking
- Hotel-only or car rental confirmations
Write an array of email IDs for the likely matches to `session/flight-likely-ids.json`
Format: { "ids": ["id1", "id2", ...] }
Be inclusive - it's better to fetch a few false positives than miss real flights.
5. [Gather Arguments: Fetch Flight Email Details] The next step requires the following arguments. Do not proceed until you have all the required information:
- `idsPath`: session/flight-likely-ids.json
- `outputPath` (default: "session/flight-emails.json"): session/flight-emails.json
6. [Run Code: Fetch Flight Email Details]: Call `run_script` with:
```json
{
  "file": {
    "path": "https://sk.ills.app/code/gmail.flights.fetch/preview",
    "args": [
      "idsPath",
      "outputPath"
    ]
  },
  "packages": null
}
```
7. [Read Flight Data Extraction Guide]: Read the documentation in: `skills/sauna/[skill_id]/references/flight.extraction.guide.md` (Guide for extracting flight details)
8. [Read Flight Email Details]: Read the file at `session/flight-emails.json` into context (Full details for likely matches only)
9. Analyze the full email content to extract flight data.
For each confirmed flight, extract:
- Departure city/airport and country
- Arrival city/airport and country
- Departure date and time
- Arrival date/time (if available)
- Airline and flight number (if available)
- Confirmation/PNR code (if available)
Filter out any remaining false positives now that you have full content.
Write extracted flights to `session/flight-records.json` as a JSON array
sorted by departure date. Use the format specified in the extraction guide.
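
The final sort-and-write step could look like the following minimal Python sketch. The record format is defined by the extraction guide, which is not reproduced here, so the `departure_datetime` key (an ISO 8601 string) and both helper names are assumptions for illustration. Substitute the guide's actual field names.

```python
import json
from datetime import datetime
from pathlib import Path


def sort_flights(records):
    """Return records ordered by departure time, earliest first.

    Assumes each record has a hypothetical "departure_datetime" key holding an
    ISO 8601 string such as "2024-05-02T10:00:00". The values should be either
    all timezone-aware or all naive, since Python cannot compare the two kinds.
    """
    return sorted(
        records,
        key=lambda r: datetime.fromisoformat(r["departure_datetime"]),
    )


def write_flight_records(records, path="session/flight-records.json"):
    """Write the sorted records to disk as a JSON array."""
    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)  # create session/ if missing
    out.write_text(json.dumps(sort_flights(records), indent=2), encoding="utf-8")
```

Parsing the timestamps before sorting, rather than comparing raw strings, keeps the order correct when the emails mix offsets or precisions.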