reitti/generate-memory-blocks.md
2025-10-24 12:52:53 +02:00

Step 1: Data Pre-processing & Filtering 🧹

The goal here is to remove data points that are not actual, intentional visits.

Remove Accommodation Stays: The first step is to filter out all visits to the known accommodation. This location serves as your base reference point, not a tourist activity.

Filter by Duration: Remove very short visits. Stops shorter than roughly 10 to 15 minutes are often just waits in traffic, brief errands (like an ATM), or GPS drift. Set a minimum duration threshold in that range to focus on meaningful stays.

Consolidate Micro-Visits: If your app generates multiple separate "visits" for wandering around a single large area (e.g., a park or a market), you may need to merge these into one continuous visit before proceeding.
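The three filters above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the `Visit` class, the `preprocess` function, and every threshold (100 m home radius, 150 m merge radius, 10-minute merge gap, 15-minute minimum stay) are assumptions chosen for the example. One design choice to note: the sketch merges micro-visits before applying the duration filter, so that several short stops in the same park are not discarded before they can be combined.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Visit:  # hypothetical record shape, for illustration only
    lat: float
    lon: float
    start: datetime
    end: datetime


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two coordinates."""
    r = 6_371_000  # mean Earth radius in meters
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def preprocess(visits, home, home_radius_m=100, merge_radius_m=150,
               merge_gap=timedelta(minutes=10), min_stay=timedelta(minutes=15)):
    """Drop accommodation stays, merge micro-visits, then drop short stops."""
    # 1. Remove stays within home_radius_m of the accommodation.
    away = [v for v in sorted(visits, key=lambda v: v.start)
            if haversine_m(v.lat, v.lon, *home) > home_radius_m]

    # 2. Merge consecutive visits that are close in both space and time.
    merged = []
    for v in away:
        prev = merged[-1] if merged else None
        if (prev is not None and v.start - prev.end <= merge_gap
                and haversine_m(prev.lat, prev.lon, v.lat, v.lon) <= merge_radius_m):
            prev.end = max(prev.end, v.end)  # extend the previous visit
        else:
            merged.append(Visit(v.lat, v.lon, v.start, v.end))  # copy, keep input intact

    # 3. Drop whatever is still shorter than min_stay after merging.
    return [v for v in merged if v.end - v.start >= min_stay]
```

Calling `preprocess(visits, home=(48.8566, 2.3522))` returns a new, time-ordered list and leaves the input visits unchanged.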

Step 2: Data Enrichment with Context 🗺️

Raw coordinates are not useful for a travel log. You need to understand what these places are.

Reverse Geocoding: Convert each visit's latitude and longitude into a human-readable address.

Point of Interest (POI) Matching: This is the most crucial step. Use a service like the Google Places API, Foursquare API, or OpenStreetMap to match the coordinates to a named place. This will give you a name (e.g., "Louvre Museum"), a category (e.g., "museum"), and other details.

Your data will transform from this: {lat: 48.8606, lon: 2.3376, start: '14:30', end: '17:00'}

To this: {name: 'Louvre Museum', category: 'museum', address: 'Rue de Rivoli, 75001 Paris', ...}
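To make the matching step concrete, here is a small sketch that attaches the nearest named place to a visit. A real pipeline would query one of the services mentioned above; to stay self-contained, this example uses a hypothetical in-memory `POIS` list as a stand-in for such a response. The `enrich` function, the 150 m cutoff, and the second POI entry are all assumptions made for illustration.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical stand-in for a POI lookup result; not real API output.
POIS = [
    {"name": "Louvre Museum", "category": "museum", "lat": 48.8606, "lon": 2.3376},
    {"name": "Example Cafe", "category": "cafe", "lat": 48.8650, "lon": 2.3300},
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two coordinates."""
    r = 6_371_000  # mean Earth radius in meters
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def enrich(visit, pois=POIS, max_distance_m=150):
    """Return a copy of the visit with the nearest POI's name and category."""
    best, best_d = None, max_distance_m
    for poi in pois:
        d = haversine_m(visit["lat"], visit["lon"], poi["lat"], poi["lon"])
        if d <= best_d:  # keep the closest POI within the cutoff
            best, best_d = poi, d
    if best is None:
        # Nothing close enough: keep the visit but mark it as unmatched.
        return {**visit, "name": None, "category": "unknown"}
    return {**visit, "name": best["name"], "category": best["category"]}
```

Visits with no POI inside the cutoff are kept with `name` set to `None`, so a later step can decide whether to fall back to the reverse-geocoded address.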

Step 3: Scoring & Identifying "Interesting" Visits

Now you can define what makes a visit "interesting" by calculating an interest score. This helps prioritize the highlights of the day.

Combine several factors into a weighted score:

Duration: Longer stays are generally more significant. A 3-hour museum visit is more important than a 20-minute coffee stop.

Distance from Accommodation: Visits far from where you're staying are likely planned day trips or major excursions and should be scored higher. This is a very strong signal of intent.

Place Category: This is key. Use the POI data from Step 2 to assign a weight to each category.

    High Interest: museum, landmark, park, tourist_attraction, historic_site.

    Medium Interest: restaurant, cafe, shopping_mall.

    Low Interest: grocery_store, pharmacy, gas_station.

Novelty: A place visited only once on the trip is typically more notable for a travel log than a coffee shop visited every morning.

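You can create a simple scoring formula, for instance: $\text{Score} = (w_d \cdot \text{Duration}) + (w_x \cdot \text{Distance}) + (w_c \cdot \text{CategoryWeight})$

Where $w_d$, $w_x$, and $w_c$ are the weights you assign to duration, distance, and category, respectively. Scale Duration and Distance to a comparable range (e.g., 0 to 1) before weighting so that neither dominates just because of its units. Novelty is not included in this example; it can be added as a fourth weighted term.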
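As a rough sketch, the formula could look like this in code. The numeric category weights, the default values for the three weights, and the caps used to scale duration and distance into the range 0 to 1 are all assumptions picked for the example; tune them against your own trips.

```python
# Assumed numeric weights for the three interest tiers listed above.
CATEGORY_WEIGHTS = {
    "museum": 1.0, "landmark": 1.0, "park": 1.0,
    "tourist_attraction": 1.0, "historic_site": 1.0,
    "restaurant": 0.5, "cafe": 0.5, "shopping_mall": 0.5,
    "grocery_store": 0.1, "pharmacy": 0.1, "gas_station": 0.1,
}


def interest_score(duration_min, distance_km, category,
                   w_d=0.4, w_x=0.3, w_c=0.3,
                   max_duration_min=240, max_distance_km=50,
                   default_category_weight=0.3):
    """Weighted sum of scaled duration, scaled distance, and category weight."""
    # Scale raw values into 0..1 so the weights are comparable.
    duration = min(duration_min / max_duration_min, 1.0)
    distance = min(distance_km / max_distance_km, 1.0)
    # Unknown categories fall back to an assumed default weight.
    category_weight = CATEGORY_WEIGHTS.get(category, default_category_weight)
    return w_d * duration + w_x * distance + w_c * category_weight
```

With these assumed defaults, a 3-hour museum visit 5 km from the accommodation scores well above a 20-minute cafe stop 1 km away, which matches the intent described above.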

Step 4: Clustering & Creating a Narrative ✍️

A simple list of interesting places is good, but a great travel log groups them into a story.

Spatio-Temporal Clustering: Group visits that are close in both location and time. For example, a visit to a museum, followed by a visit to a café next door 15 minutes later, should be part of the same event.

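Algorithm Choice: A density-based algorithm like DBSCAN (Density-Based Spatial Clustering of Applications with Noise) fits this well, since it does not need the number of clusters in advance. Standard DBSCAN uses a single distance threshold, so define the "neighborhood" with a custom rule that checks both time (e.g., within 2 hours of each other) and space (e.g., within 500 meters of each other), or use a spatio-temporal variant such as ST-DBSCAN, to find these groups automatically.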
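The following is a small, from-scratch sketch of DBSCAN using a combined time-and-space neighborhood test, to show how the two thresholds can work together. It is quadratic in the number of visits, which is usually acceptable for a single trip. The visit shape (dicts with `lat`, `lon`, and `start`/`end` in minutes since midnight), the function names, and `min_pts=2` are assumptions for illustration.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two coordinates."""
    r = 6_371_000  # mean Earth radius in meters
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def is_neighbor(a, b, max_gap_min=120, max_dist_m=500):
    """True when two visits are close in both time and space."""
    # Gap between the visits in minutes; zero or negative if they touch or overlap.
    gap = max(a["start"], b["start"]) - min(a["end"], b["end"])
    dist = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
    return gap <= max_gap_min and dist <= max_dist_m


def dbscan(visits, min_pts=2, neighbor=is_neighbor):
    """Return one label per visit: 0, 1, ... for clusters, -1 for noise."""
    n = len(visits)
    labels = [None] * n  # None means "not yet processed"

    def region(i):
        # All visits in the neighborhood of visit i, including i itself.
        return [j for j in range(n) if neighbor(visits[i], visits[j])]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels
```

Visits labeled `-1` are isolated rather than unimportant, so for a travel log it is reasonable to treat each of them as its own single-visit entry. Setting `min_pts=1` achieves the same effect directly, since every visit then forms at least its own cluster.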

Summarize the Cluster: Once you have a cluster of visits, create a single travel log entry for it.

    Title: Name the event after the highest-scoring visit within the cluster, optionally adding the next most significant place (e.g., "Visit to the Eiffel Tower and Champ de Mars").

    Timeframe: Use the start time of the first visit and the end time of the last visit in the cluster.

    Content: List the significant places visited within that cluster.
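A minimal sketch of the summary step, assuming each visit in a cluster is a dict with `name`, `score`, `start`, and `end` keys, and that times are zero-padded 24-hour `HH:MM` strings (so plain string comparison orders them correctly). The `summarize` function and the `min_score` cutoff for "significant" places are hypothetical choices for this example.

```python
def summarize(cluster, min_score=0.3):
    """Build one travel log entry from the visits in a single cluster."""
    # Assumes zero-padded "HH:MM" strings, which sort chronologically.
    ordered = sorted(cluster, key=lambda v: v["start"])
    top = max(cluster, key=lambda v: v["score"])  # highest-scoring visit
    return {
        "title": f"Visit to {top['name']}",
        "start": ordered[0]["start"],            # start of the first visit
        "end": max(v["end"] for v in cluster),   # latest end time in the cluster
        # Only list places whose score clears the assumed cutoff.
        "places": [v["name"] for v in ordered if v["score"] >= min_score],
    }
```

The entry keeps the places in chronological order, so the content reads as a sequence even when the title comes from a visit in the middle of the cluster.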