Automated Taiwan Lottery & Sports Betting Tracker
Introduction to the Lottery Tracker
Welcome! This article walks through a Python script that tracks Taiwan's lottery and sports betting locations. The project lives in a .ipynb file (a Jupyter Notebook) and automates gathering and monitoring data from both the Taiwan Lottery and Taiwan Sports Lottery websites. It is useful for anyone who wants to know where these betting locations are, and especially for monitoring changes such as new openings, relocations, or closures. The script relies on several Python libraries: requests for fetching web pages, BeautifulSoup for parsing HTML content, pandas for data manipulation and storage, and openpyxl for writing Excel files. The project is designed to be easy to adapt and extend, and it provides a framework for detecting changes from one run to the next. Automating the collection and comparison steps greatly reduces the manual effort needed to keep track of these locations. The script also incorporates error handling and data validation to improve the reliability of the information gathered. Because everything is in a notebook, users can replicate the process, examine the code, and adjust parameters to meet specific needs, which makes it a handy resource for data enthusiasts and analysts who want to explore this sector systematically.
Core Functionalities of the Script
The script automates the tracking of lottery and sports betting locations in Taiwan. It scrapes data from the Taiwan Lottery and Taiwan Sports Lottery websites, organizes the gathered data, compares it against the previous data, and then generates reports. The first step is setting up the environment: the script imports requests, BeautifulSoup, pandas, datetime, time, urllib.parse, os, re, openpyxl, and pytz, which together cover web scraping, data handling, and report generation. It then sets key parameters such as the time zone, the current date, and the file paths, all of which are needed for correct operation and data storage. Next, the script uses requests to fetch content from the two websites and BeautifulSoup to parse the HTML and extract the relevant fields, such as shop names and addresses.
The central step is comparing current data with historical data. A mark_status function compares the current dataset with the previous one and identifies new, relocated, and closed locations, which is what lets the user see how these businesses change over time. After extraction and comparison, the generate_report function produces reports that summarize the changes and highlight specific updates, making the data easy to review.
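The comparison step can be illustrated with a minimal sketch. The article only names the mark_status function, so everything else here is an assumption made for illustration: the signature, the column names name and address, the status labels, and the idea that the shop name uniquely identifies a location. The notebook's real logic may differ.

```python
import pandas as pd


def mark_status(current: pd.DataFrame, previous: pd.DataFrame) -> pd.DataFrame:
    """Label each shop as new, relocated, closed, or unchanged.

    Assumption of this sketch: both frames have 'name' and 'address'
    columns, and 'name' uniquely identifies a shop.
    """
    # An outer merge keeps shops that appear in either dataset; the
    # indicator column records which side each row came from.
    merged = current.merge(
        previous,
        on="name",
        how="outer",
        suffixes=("_current", "_previous"),
        indicator=True,
    )

    # Start from "unchanged" and overwrite the rows that differ.
    status = pd.Series("unchanged", index=merged.index)
    status[merged["_merge"] == "left_only"] = "new"       # only in current data
    status[merged["_merge"] == "right_only"] = "closed"   # only in previous data
    moved = (merged["_merge"] == "both") & (
        merged["address_current"] != merged["address_previous"]
    )
    status[moved] = "relocated"                           # same shop, new address

    merged["status"] = status
    return merged.drop(columns="_merge")


# Hypothetical sample data, not taken from either website.
previous = pd.DataFrame(
    {"name": ["Shop A", "Shop B", "Shop C"],
     "address": ["1 First Rd", "2 Second Rd", "3 Third Rd"]}
)
current = pd.DataFrame(
    {"name": ["Shop A", "Shop B", "Shop D"],
     "address": ["1 First Rd", "9 Ninth Rd", "4 Fourth Rd"]}
)

changes = mark_status(current, previous)
print(changes[["name", "status"]])
```

Matching on the name alone is a simplification. If two shops can share a name, a more stable key such as a shop ID, if the websites expose one, would be a safer choice for the merge.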
Setting Up the Environment
Setting up the environment is the first step for the tracking system. The script begins by importing the Python libraries it needs: requests to fetch pages from the internet, BeautifulSoup to parse the HTML content, pandas to handle the data in a structured format, and openpyxl to save the processed information to Excel files. It also imports datetime and pytz for time-related functions, urllib.parse for URL manipulation, os for interacting with the operating system, re for regular expressions, and google.colab.drive for accessing Google Drive, where the data and reports are stored. Together, these libraries let the script fetch data, parse it, and write the results in a readable format. A key aspect of the setup is the configuration of time zones. The script sets the time zone to