I want to scrape a website asynchronously using a list of tor circuits with different exit nodes and making sure each exit node only makes a request every 5 seconds.
For testing purposes, I'm using the website https://books.toscrape.com/ and I'm lowering the sleep time, number of circuits and number of pages to scrape.
I'm getting the following two errors when I use the
--tor argument. Both related to the torpy package.
'TorWebScraper' object has no attribute 'circuits'
'_GeneratorContextManager' object has no attribute 'create_stream'
Here is the relevant code causing the error:
async with aiohttp.ClientSession() as session:
for circuit in self.circuits:
async with circuit.create_stream() as stream:
async with session.get(url, proxy=stream.proxy) as response:
text = await response.text()
return url, text
Here there is more context