Web scraping is the process of collecting structured web data in an automated fashion. It’s also called web data extraction. Some of the main use cases of web scraping include price monitoring, price intelligence, news monitoring, lead generation, and market research among many others. Web scraping a web page involves fetching it and extracting from it. Fetching is the downloading of a page (which a browser does when a user views a page). Therefore, web crawling is the main component of web scraping, to fetch pages for later processing. Once fetched, then extraction can take place. The content of a page may be parsed, searched, reformatted, its data copied into a spreadsheet, and so on. Web scrapers typically take something out of a page, to make use of it for another purpose somewhere else. An example would be to find and copy names and phone numbers, or companies and their URLs, to a list (contact scraping).
Before getting into the code, Let’s briefly describe the scraping strategy:
- Insert into a CSV file the exact routes and dates you want to scrape. One can insert as many routes as you want but it’s important to use these columns’ names. the scraper works only for Roundtrips.
- Run the full code.
- The output for each flight is a CSV file. Its file name will be the date and time that the scraping was performed.
- All flights of the same route will automatically be located by the scraper in the appropriate folder (the name of the route).