
Web Scraping with AI


Code and Workbook on GitHub


Description

This is for educational purposes only... ahem...


This class will show you how to scrape the web for information that your AI system can use to provide responses. We will scrape websites, RSS feeds, PDF files, and YouTube videos.


The trick to scraping the web for your AI projects is to get text in a usable format and to remove unnecessary text so that you do not overpay for tokens. We will be using well-known Python modules to get text from websites, PDFs, and YouTube videos so that we can then do something based on that information.
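As a rough illustration of the "remove unnecessary text" idea, here is a minimal sketch that strips markup and boilerplate from a page before it is sent to a model. It uses Python's standard-library html.parser instead of BeautifulSoup so that it runs without extra installs. The names TextExtractor and html_to_text, the list of skipped tags, and the sample HTML are all assumptions made for this example, not part of the class materials.

```python
from html.parser import HTMLParser
import re


class TextExtractor(HTMLParser):
    """Collect visible text, skipping tags whose content rarely helps a model."""

    # Hypothetical choice of tags to drop; tune this for the sites you scrape.
    SKIP = {"script", "style", "nav", "footer", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the page's visible text with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks))


# Made-up sample page: only the paragraph text should survive.
sample = """<html><head><style>p {color: red}</style></head>
<body><nav>Home | About</nav><p>Scraping   is   fun.</p>
<script>track()</script><footer>Copyright</footer></body></html>"""

clean = html_to_text(sample)
print(clean)
```

Dropping navigation, scripts, and styling before the text reaches the model is where most of the token savings come from; the whitespace collapse is a smaller second pass.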


This class will demonstrate beyond a shadow of a doubt why AI will "kill" the internet...


This class will go over:

  • The concept of scraping documents

  • BeautifulSoup for web page scraping

  • FeedParser for RSS feed scraping

  • Scraping PDFs

  • Scraping YouTube videos

  • What to do with the data once you have it

  • How to build a full auto blog

  • Legal and ethical considerations
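To give a feel for the RSS item in the list above, here is a minimal sketch of pulling titles, links, and summaries out of a feed. It uses the standard-library xml.etree.ElementTree as a stand-in for FeedParser and handles only simple RSS 2.0 documents. The function name parse_rss and the sample feed are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def parse_rss(xml_text):
    """Return a list of {title, link, summary} dicts from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns None when a child is missing, so fall back to "".
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


# Made-up feed for illustration; a real feed would be downloaded first.
sample_feed = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example feed</title>
<item><title>First post</title><link>https://example.com/1</link>
<description>Short summary.</description></item>
</channel></rss>"""

entries = parse_rss(sample_feed)
for entry in entries:
    print(entry["title"], entry["link"])
```

A dedicated feed library also copes with Atom feeds, namespaces, and malformed XML, which this sketch does not attempt.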


We will explain how these tools work, demonstrate how to use them in live code, and give you time to work through simple labs so you can get a feel for how easy AI can be.



