Venue Image Enrichment Testing: A Comprehensive Guide

Alex Johnson

Introduction: Ensuring a Flawless Launch

Before launching our venue image enrichment system into the wild, it's crucial to put it through a rigorous pre-production testing phase. This process is designed to iron out any potential wrinkles, guaranteeing that our system not only functions correctly but also maintains data integrity and behaves predictably under various conditions. This comprehensive testing plan will cover a wide spectrum of scenarios, from straightforward successes to intricate error handling and cooldown logic, ultimately ensuring a seamless user experience and the efficient use of our resources.

Why Pre-Production Testing Matters

Pre-production testing is more than just a formality; it's a critical step in the software development lifecycle. It allows us to identify and rectify any issues before they impact our users or, worse, lead to costly errors in production. The primary goals of this testing phase are as follows:

  • Image Integrity: Confirming images are saved and displayed accurately.
  • Cooldown Efficiency: Preventing redundant API calls through the cooldown logic.
  • Error Resilience: Ensuring error handling does not mistakenly trigger cooldowns.
  • Mode Consistency: Verifying the proper function of both development and production modes.
  • Data Reliability: Maintaining data integrity across all test cases.

A. Success Scenarios: Celebrating the Wins

Let's begin by focusing on the scenarios where everything goes as planned. These tests ensure our system behaves optimally under ideal conditions. Each test will follow a standard structure: setup, execution, and verification.

A1. Venues with Images Found: The Core Functionality

Core testing for image retrieval. This is the bread and butter of our system.

  • Setup: We start by selecting venues known to have images on Google Places. These are our baseline tests to ensure image retrieval works as expected.

  • Execute: We'll run the image enrichment process on a batch of 5-10 venues. This allows us to check for any batch processing-related issues.

  • Verify: Here’s what we expect to see after the enrichment process is complete:

    • Images are saved in the venue_images table with the correct structure.
    • The upload_status is set to "skipped_dev" in the development environment and "uploaded" in production.
    • The enrichment_metadata.last_enriched_at field is correctly populated, noting the last time enrichment was performed.
    • enrichment_metadata.last_attempt_result is set to "success."
    • The enrichment history UI correctly displays a "Success" status.
    • The details panel displays the images, using provider URLs in the development environment and ImageKit URLs in production.
    • Re-running the enrichment process respects the 90-day cooldown period. In other words, if enrichment was successful less than 90 days ago, it shouldn't re-enrich.

SQL Verification:

We'll use SQL queries to double-check the database state, verifying the number of images and the last enrichment timestamp:

SELECT id, name, 
       jsonb_array_length(venue_images) as image_count,
       image_enrichment_metadata->'last_enriched_at' as last_enriched,
       image_enrichment_metadata->'last_attempt_result' as result
FROM venues 
WHERE id IN (...test venue ids...);

A2. Venues with No Images: Handling Empty Results

Testing scenarios where no images are found. Not all venues will have images. This test confirms that the system handles these cases gracefully.

  • Setup: We'll use venues that are unlikely to have Google Places images—perhaps new or very small venues.

  • Execute: We run the enrichment process.

  • Verify: What we should observe:

    • The venue_images field remains an empty array or null.
    • enrichment_metadata.last_enriched_at is populated.
    • enrichment_metadata.last_attempt_result is set to "no_images."
    • The enrichment history UI shows a "No Images" status.
    • Re-running the enrichment process respects a 7-day cooldown (rather than the 90-day cooldown for successful image retrieval).

Cooldown Test:

We'll use an Elixir IEx console to test the cooldown logic:

# In IEx console
venue = Repo.get!(Venue, [venue_id])
Orchestrator.needs_enrichment?(venue, false)
# Should return false if within 7 days

A3. Partial Success (Mixed Results): Handling Varied Outcomes

Testing scenarios where some venues succeed while others fail. The real world is messy, and our system needs to cope with mixed results.

  • Setup: We'll run enrichment on multiple venues with a mix of outcomes—some with images, some without.

  • Execute: We execute a batch enrichment job with a variety of venue statuses.

  • Verify: What we must ensure:

    • Each venue has the correct individual status.
    • The batch job summary accurately reflects the counts (successes, failures, no images).
    • There is no cross-contamination of results between venues; one venue's issues shouldn't affect another.
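
The batch summary check can be illustrated with a small fold over per-venue outcomes. This is a sketch only — the field names below are assumptions, and the real job reports these counts in its Oban job metadata:

```elixir
# Illustrative sketch: fold per-venue outcomes into the summary counts
# the batch job is expected to report. Names are assumptions, not the real API.
results = [
  %{venue_id: 1, result: "success"},
  %{venue_id: 2, result: "no_images"},
  %{venue_id: 3, result: "error"},
  %{venue_id: 4, result: "success"}
]

summary =
  Enum.reduce(results, %{"success" => 0, "no_images" => 0, "error" => 0}, fn %{result: r}, acc ->
    Map.update!(acc, r, &(&1 + 1))
  end)

IO.inspect(summary)
# => %{"error" => 1, "no_images" => 1, "success" => 2}
```

The verification step is then a comparison of this expected summary against the counts the job actually recorded.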

B. Error Scenarios: Preparing for the Unexpected

Testing different failure scenarios. Now, let's turn our attention to the situations where things don’t go as planned. It's crucial that our system handles errors gracefully without causing unintended consequences like excessive API calls.

B1. Invalid API Key: Protecting Against Misconfiguration

Testing cases where the API key is incorrect. This is a critical scenario for preventing accidental API overuse.

  • Setup: We set an invalid Google Places API key in the environment variables.

  • Execute: We run the enrichment process.

  • Verify: We expect:

    • enrichment_metadata.last_attempt_result to be set to "error" (NOT "no_images").
    • enrichment_metadata.error_details to contain the API error message.
    • enrichment_metadata.last_enriched_at is not updated (or is explicitly marked as a failed attempt).
    • Re-running enrichment does NOT respect the cooldown; it can retry immediately.
    • The enrichment history displays an error status with detailed information.

Critical Point: API errors must not trigger a cooldown. Only legitimate "no images" responses should. This ensures we don’t get stuck in a loop of failed enrichment attempts.
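
The three-way rule above — 90-day cooldown after success, 7-day cooldown after "no_images," no cooldown after an error — can be sketched as a small decision function. This is an illustration of the rules as described in this guide, not the actual Orchestrator implementation:

```elixir
defmodule CooldownSketch do
  # Illustrative sketch of the cooldown rules described in this guide
  # (not the real Orchestrator code):
  #   "success"   -> 90-day cooldown
  #   "no_images" -> 7-day cooldown
  #   "error"     -> no cooldown; retry immediately

  @day_seconds 86_400

  # Errors never trigger a cooldown.
  def needs_enrichment?(%{"last_attempt_result" => "error"}), do: true

  def needs_enrichment?(%{"last_attempt_result" => result, "last_enriched_at" => last_iso}) do
    cooldown_days =
      case result do
        "success" -> 90
        "no_images" -> 7
      end

    {:ok, last, _offset} = DateTime.from_iso8601(last_iso)
    DateTime.diff(DateTime.utc_now(), last) > cooldown_days * @day_seconds
  end

  # No metadata yet: always enrich.
  def needs_enrichment?(_), do: true
end

# A venue successfully enriched one day ago is still inside the 90-day window:
recent = DateTime.utc_now() |> DateTime.add(-86_400) |> DateTime.to_iso8601()
IO.inspect(CooldownSketch.needs_enrichment?(%{"last_attempt_result" => "success", "last_enriched_at" => recent}))
# => false
```

Note how the "error" clause short-circuits before any date arithmetic — that is exactly the property B1 verifies.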

B2. Rate Limit Exceeded: Managing API Usage

Testing rate limit handling. The Google Places API enforces rate limits, and we need to ensure the system handles them gracefully.

  • Setup: We trigger the rate limit by making rapid requests.

  • Execute: We run enrichment on many venues quickly to simulate hitting the rate limit.

  • Verify: We should see:

    • The rate limit error captured in the metadata.
    • last_attempt_result set to "error."
    • The system can retry after the rate limit resets.
    • No cooldown is applied. We want to retry as soon as the limit is lifted.

B3. Network Timeout: Handling Connectivity Issues

Testing network timeout scenarios. Network issues are inevitable, and our system must handle them robustly.

  • Setup: We simulate network issues (if possible) or observe natural timeouts.

  • Execute: We run the enrichment process.

  • Verify: We expect:

    • The timeout is captured as an error, not "no_images."
    • The system can retry immediately.
    • The Oban job is marked as failed/retryable, so it can be automatically retried.
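
The retryable behavior in the last bullet follows from Oban's standard worker contract: returning {:error, reason} from perform/1 marks the job as failed and schedules a retry with backoff, up to max_attempts. A minimal sketch — the module name, queue, and enrich/1 helper are assumptions:

```elixir
defmodule MyApp.ImageEnrichmentWorker do
  # Sketch of how a worker lets Oban retry transient failures.
  use Oban.Worker, queue: :enrichment, max_attempts: 5

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"venue_id" => venue_id}}) do
    case enrich(venue_id) do
      {:ok, _images} ->
        :ok

      # Returning {:error, reason} marks the job as retryable, so a
      # network timeout is retried automatically on the next attempt.
      {:error, reason} ->
        {:error, reason}
    end
  end

  # Placeholder for the real enrichment call.
  defp enrich(_venue_id), do: {:error, :timeout}
end
```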

B4. Invalid Place ID: Dealing with Bad Data

Testing with incorrect place IDs. It’s possible to encounter invalid Google Place IDs. The system needs to handle these scenarios without crashing.

  • Setup: We use a venue with an invalid or missing google_place_id.

  • Execute: We run the enrichment process.

  • Verify: We want to see:

    • The error captured appropriately.
    • The enrichment job does not crash.
    • Other venues in the batch are still processed without interruption.

B5. Malformed API Response: Handling Unexpected Data

Testing against unexpected API responses. APIs can sometimes return data in unexpected formats. Our system needs to be prepared for this.

  • Setup: If possible, we simulate the API returning an unexpected format.

  • Execute: We run the enrichment process.

  • Verify: We verify:

    • Graceful error handling.
    • Error details are logged for debugging.
    • The system can retry.

C. Cooldown Scenarios: Fine-Tuning the Timing

Testing the cooldown logic. Cooldowns are in place to prevent unnecessary API calls. These tests confirm the logic functions precisely as intended.

C1. Recently Enriched with Images (Within 90 Days): The 90-Day Rule

Testing the 90-day cooldown. This prevents re-enriching venues that already have images within the last 90 days.

  • Setup: A venue was enriched today with images.
  • Test: We run the enrichment process again.
  • Verify:
venue = Repo.get!(Venue, [id])
Orchestrator.needs_enrichment?(venue, false)
# Should return false

C2. Recently Enriched with No Images (Within 7 Days): The 7-Day Rule

Testing the 7-day cooldown. This prevents re-enriching venues where no images were found within the last 7 days.

  • Setup: A venue was enriched today, and no images were found.
  • Test: We run the enrichment process again.
  • Verify:
venue = Repo.get!(Venue, [id])
Orchestrator.needs_enrichment?(venue, false)
# Should return false

C3. Stale with Images (>90 Days): Re-Enriching Old Data

Testing the 90-day refresh. Ensure venues with images older than 90 days are re-enriched.

  • Setup: We manually set the last_enriched_at to 91 days ago for a venue that has images.
UPDATE venues 
SET image_enrichment_metadata = jsonb_set(
  COALESCE(image_enrichment_metadata, '{}'::jsonb),
  '{last_enriched_at}',
  to_jsonb((NOW() - interval '91 days')::text)
)
WHERE id = [test_venue_id];
  • Test: We run the enrichment process.
  • Verify: The venue IS re-enriched. The cooldown should have expired.

C4. No Images Cooldown Expired (>7 Days): Refreshing Empty Results

Testing the 7-day refresh. Ensure venues where no images were found more than 7 days ago are re-enriched.

  • Setup: We set last_enriched_at to 8 days ago for a venue where no images were found previously.
UPDATE venues 
SET image_enrichment_metadata = jsonb_set(
  COALESCE(image_enrichment_metadata, '{}'::jsonb),
  '{last_enriched_at}',
  to_jsonb((NOW() - interval '8 days')::text)
)
WHERE id = [test_venue_id];
  • Test: We run the enrichment process.
  • Verify: The venue IS re-enriched. The cooldown should have expired.

D. Data Integrity: Ensuring Data Quality

Testing for data quality. Data integrity is paramount. These tests will verify that our data is accurate, consistent, and well-structured.

D1. Venue Images Structure: Validating Image Data

Testing the structure of each image. We need to make sure the data associated with each image is in the right format and contains the necessary information.

  • Verify each image in venue_images has:

    • provider: "google_places"
    • provider_url: A valid URL
    • upload_status: "uploaded" | "skipped_dev" | "failed"
    • imagekit_url: Present in production, null in dev.
    • imagekit_file_id: Present in production, null in dev.
    • imagekit_thumbnail_url: Present in production, null in dev.
    • enriched_at: An ISO8601 timestamp.
    • metadata: An object containing Google Places attribution details.
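
Putting the checklist together, a single development-mode entry in venue_images might look like this (all values are illustrative, not real data):

```json
{
  "provider": "google_places",
  "provider_url": "https://lh3.googleusercontent.com/places/example-photo",
  "upload_status": "skipped_dev",
  "imagekit_url": null,
  "imagekit_file_id": null,
  "imagekit_thumbnail_url": null,
  "enriched_at": "2024-05-01T12:00:00Z",
  "metadata": {
    "attribution": "Photo by Example Contributor"
  }
}
```

In production, the three imagekit_* fields would be populated and upload_status would be "uploaded."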

D2. Enrichment Metadata Structure: Validating Metadata Consistency

Testing the enrichment metadata structure. The metadata contains vital information about the enrichment process.

  • Verify image_enrichment_metadata contains:

    • schema_version: "1.0"
    • scoring_version: "1.0"
    • last_enriched_at: An ISO8601 timestamp.
    • completeness_score: A number between 0 and 1.
    • next_enrichment_due: An ISO8601 timestamp.
    • providers_used: An array containing "google_places"
    • total_images_fetched: A number.
    • enrichment_history: An array (maximum 10 entries).
    • last_attempt_result: "success" | "no_images" | "error"
    • last_attempt_details: An object.
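
For reference, a well-formed image_enrichment_metadata object after a successful run might look like this (illustrative values; note next_enrichment_due lands 90 days after last_enriched_at, matching the success cooldown):

```json
{
  "schema_version": "1.0",
  "scoring_version": "1.0",
  "last_enriched_at": "2024-05-01T12:00:00Z",
  "completeness_score": 0.8,
  "next_enrichment_due": "2024-07-30T12:00:00Z",
  "providers_used": ["google_places"],
  "total_images_fetched": 4,
  "enrichment_history": [],
  "last_attempt_result": "success",
  "last_attempt_details": {}
}
```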

D3. Oban Job Metadata: Validating Job Metrics

Testing Oban job metadata. Oban is our job processing system. This test ensures that job metadata accurately reflects the outcome of the enrichment process.

  • Verify the Oban job meta field contains:

    • status: Matches the actual outcome (e.g., success, error).
    • images_discovered: The count of all images found.
    • images_uploaded: The count of successfully processed images (uploaded OR skipped_dev).
    • images_failed: The count of failed uploads.
    • summary: A human-readable description of the outcome.
    • providers_succeeded: An array of providers that succeeded.
    • providers_failed: An array of providers that failed.

D4. Enrichment History UI Display: Validating UI Accuracy

Testing the display of the enrichment history. The UI needs to accurately reflect the enrichment history.

  • For each job in history:

    • The status badge matches the meta.status.
    • Image counts are displayed correctly.
    • The details panel shows images when the status is success.
    • Development mode displays a "(Development Mode - Source Images)" label.
    • Provider URLs are used in development, and ImageKit URLs are used in production.

D5. Development vs Production Mode: Validating Environment-Specific Behavior

Testing development versus production behavior. Our system should behave differently in each environment to ensure cost-effectiveness and appropriate functionality.

  • Development:

    • ImageKit upload is disabled.
    • A maximum of 2 images per provider (configurable).
    • upload_status is "skipped_dev."
    • Images are displayed from the provider_url.
  • Production:

    • ImageKit upload is enabled.
    • A maximum of 10 images per provider (configurable).
    • upload_status is "uploaded."
    • Images are displayed from the imagekit_url.
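
One common way to express this split is environment-specific configuration. The following is a sketch only — the application name and config keys are assumptions, not the system's actual config:

```elixir
# config/dev.exs (sketch; app and key names are assumptions)
import Config

config :my_app, :image_enrichment,
  upload_to_imagekit: false,
  max_images_per_provider: 2

# config/prod.exs
config :my_app, :image_enrichment,
  upload_to_imagekit: true,
  max_images_per_provider: 10
```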

D6. No Data Corruption: Preventing Data Issues

Testing to ensure data integrity over time. We want to ensure that our system doesn't introduce data corruption.

  • After multiple enrichment runs:

    • There are no duplicate images in venue_images.
    • The enrichment_history is limited to 10 entries.
    • Old entries are properly rotated out (removed).
    • Timestamps are in the correct format.
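
Two of these checks can be automated with SQL spot checks, following the same column layout as the verification query in A1 (venue_images and image_enrichment_metadata as jsonb columns on venues):

```sql
-- Venues whose venue_images array contains duplicate provider URLs
SELECT id, name
FROM venues,
     LATERAL (
       SELECT count(*) AS total,
              count(DISTINCT img->>'provider_url') AS distinct_urls
       FROM jsonb_array_elements(venue_images) AS img
     ) dup
WHERE dup.total > dup.distinct_urls;

-- Venues whose enrichment_history exceeds the 10-entry cap
SELECT id, name,
       jsonb_array_length(image_enrichment_metadata->'enrichment_history') AS history_len
FROM venues
WHERE jsonb_array_length(image_enrichment_metadata->'enrichment_history') > 10;
```

Both queries should return zero rows after repeated enrichment runs.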

E. Edge Cases: Handling the Uncommon

Testing edge cases. Edge cases are those unusual scenarios that can reveal hidden bugs. These tests ensure the system handles unexpected situations gracefully.

E1. Venue with No Coordinates: Handling Missing Location Data

Testing when location data is missing. Some venues might lack precise location information.

  • Setup: A venue is missing latitude/longitude coordinates.
  • Execute: We run the enrichment process.
  • Verify: Graceful handling and an appropriate error message are displayed.

E2. Venue with No Google Place ID: Handling Missing IDs

Testing when the Google Place ID is missing. Some venues may not have a Google Place ID.

  • Setup: A venue is missing a google_place_id.
  • Execute: We run the enrichment process.
  • Verify: The venue is either skipped, or a fallback mechanism (like geocoding) is used.

E3. Manual Data Cleanup: Reacting to Changes

Testing how it reacts to data modifications. What happens if someone manually deletes an image?

  • Setup: We manually delete images from a venue.
  • Execute: We run the enrichment process again (if the cooldown has expired).
  • Verify: New images are fetched and saved.

Test Execution Plan: The Roadmap to Success

Putting the plan into action. Here's how we'll approach this testing phase systematically.

  1. Setup: Create a fresh test database or use a development environment with enrichment data cleared to prevent interference from previous tests.
  2. Execute: Systematically run each test category as outlined above.
  3. Document: Record the results for each test scenario, including any issues encountered and their resolution.
  4. Fix: Address any failures or discrepancies before deploying to production.
  5. Verify: Retest any failed scenarios after implementing fixes to ensure the issues are resolved.

Success Criteria: Defining Victory

Determining the success of the tests. Our system is ready for production if all test scenarios pass with the following:

  • ✅ Correct data is saved to the database.
  • ✅ Correct cooldown behavior is observed.
  • ✅ Proper error handling is in place.
  • ✅ Accurate UI display.
  • ✅ No data corruption.

Notes: Additional Information

Important notes to keep in mind.

  • Use the /tmp/test_cooldown.exs script for cooldown testing.
  • Check both database state AND the UI display for each test.
  • Verify that the Oban job metadata matches the venue enrichment metadata.
  • Test thoroughly in development mode before deploying to production.

In summary, pre-production testing is essential for ensuring that our venue image enrichment system functions reliably, efficiently, and without data corruption. By systematically testing all the scenarios detailed above, we will deliver a high-quality product that meets our users' expectations.

For more information on the Google Places API and best practices, check out the official Google Places API documentation.