
This lab shows you how to load a web page into a variable, pull out the contents of the article in the post, and then submit that content to ChatGPT to ask for a summary.
We use regular expressions and the re module to pull out all text within <p> tags. We also clean up the text with a second regular expression that deletes the leading fragment of each opening <p> tag, which may contain styling information.
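As a small illustration of these two regular-expression steps, the sketch below runs them on a hard-coded HTML snippet instead of a downloaded page. The sample markup and the names `sample_html` and `extract_paragraphs` are made up for this example, and the patterns are one reasonable choice rather than the only way to do it.

```python
import re

# Hypothetical sample markup standing in for a downloaded page.
sample_html = """
<html><head><style>p { color: red; }</style></head>
<body>
<p class="intro">Apple opposes the bill.</p>
<p>Encryption is
at stake.</p>
<pre>not a paragraph</pre>
</body></html>
"""

def extract_paragraphs(html):
    """Return the text inside each <p>...</p> pair, without the opening tag's attributes."""
    # Step 1: capture everything between "<p" and "</p>".
    # \b keeps tags such as <pre> from matching; re.DOTALL lets "." cross line breaks.
    raw = re.findall(r"<p\b(.*?)</p>", html, flags=re.DOTALL)
    # Step 2: drop the rest of the opening tag, up to and including its first ">".
    return [re.sub(r"^[^>]*>", "", chunk) for chunk in raw]

paragraphs = extract_paragraphs(sample_html)
text = " ".join(paragraphs)
print(paragraphs)
```

Inline tags inside a paragraph, such as links, are left in place by this sketch; only the opening <p> tag is stripped.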
This parsing is necessary because if you feed ChatGPT the entire web page, even a small page can end up requiring around 20K tokens, mostly because of JavaScript and styling information.
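To get a feel for the difference, the sketch below compares a rough token estimate for a whole page against the estimate for just its paragraph text. It assumes the common rule of thumb of about four characters per token for English text, which is only an approximation. The sample page and the `estimate_tokens` helper are hypothetical, and exact counts would need the model's real tokenizer.

```python
import re

CHARS_PER_TOKEN = 4  # assumption: rough rule of thumb, not an exact tokenizer

def estimate_tokens(s):
    """Very rough token estimate based on character count."""
    return len(s) // CHARS_PER_TOKEN

# Hypothetical page: a lot of script and style, very little article text.
page = (
    "<html><head><style>" + "body { margin: 0; } " * 200 + "</style>"
    + "<script>" + "console.log('tracking'); " * 200 + "</script></head>"
    + "<body><p class=\"lead\">Short article text.</p></body></html>"
)

# Single-step variant: skip the opening tag's attributes and capture only the text.
paragraphs = re.findall(r"<p\b[^>]*>(.*?)</p>", page, flags=re.DOTALL)
article = " ".join(paragraphs)

print("whole page:", estimate_tokens(page), "tokens (estimated)")
print("article only:", estimate_tokens(article), "tokens (estimated)")
```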

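import re

import openai  # uses the legacy interface from openai versions before 1.0
import requests

openai.api_key = "APIKEY"  # replace with your own API key

url = "https://arstechnica.com/tech-policy/2023/06/apple-says-uk-online-safety-bill-is-serious-threat-to-end-to-end-encryption/"

# Download the page and keep the HTML as a string.
data = requests.get(url).text

# Capture everything between "<p" and "</p>".
# \b stops tags such as <pre> from matching; re.DOTALL lets a paragraph span several lines.
reg_str = r"<p\b(.*?)</p>"
paragraphs = re.findall(reg_str, data, flags=re.DOTALL)

# Matches the rest of the opening <p> tag (its attributes, up to the first ">").
swap = r"^[^>]*>"

text = ""
for x in paragraphs:
    x = re.sub(swap, "", x)
    text = f"{text} {x}"
    print(x)
    print("----------")

print("\n\n")

# Send the extracted article text to the model and ask what it is about.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "assistant", "content": text},
        {"role": "user", "content": "What is this about?"},
    ],
)
print("AI SAYS:")
print(response["choices"][0]["message"]["content"])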