π GenerateScraperNode Module
The GenerateScraperNode
module implements a node responsible for generating a Python script for scraping a website using the specified library. It takes the user's prompt and the scraped content as input and generates a Python script that extracts the information requested by the user.
Classesβ
GenerateScraperNode
β
GenerateScraperNode
is a node responsible for generating a Python script for scraping a website using the specified library.
Attributesβ
- llm_model: An instance of a language model client, configured for generating answers.
- library (str): The Python library to use for scraping the website.
- source (str): The website to scrape.
- verbose (bool): A flag indicating whether to show print statements during execution.
Methodsβ
-
__init__(self, input: str, output: List[str], library: str, website: str, node_config: Optional[dict] = None, node_name: str = "GenerateScraper")
- Initializes the
GenerateScraperNode
with a language model client, a library, a website, and a node name. - Args:
input (str)
: Boolean expression defining the input keys needed from the state.output (List[str])
: List of output keys to be updated in the state.library (str)
: The Python library to use for scraping the website.website (str)
: The website to scrape.node_config (dict, optional)
: Additional configuration for the node.node_name (str, optional)
: The unique identifier name for the node. Defaults to "GenerateScraper".
- Initializes the
-
execute(self, state: dict) -> dict
- Generates a Python script for scraping a website using the specified library.
- Args:
state (dict)
: The current state of the graph. The input keys will be used to fetch the correct data from the state.
- Returns:
dict
: The updated state with the output key containing the generated answer.
- Raises:
KeyError
: If input keys are not found in the state, indicating that the necessary information for generating an answer is missing.
Example Usageβ
Here is an example of how to use the GenerateScraperNode
class:
from generate_scraper_node import GenerateScraperNode
# Define a generate scraper node
generate_scraper_node = GenerateScraperNode(
input="user_input & document",
output=["generated_script"],
library="requests",
website="https://example.com"
)
# Define the state
state = {
"user_input": "What information do you want to scrape from the website?",
"document": [document]
}
# Execute the generate scraper node
state = generate_scraper_node.execute(state)
# Retrieve the generated script from the state
generated_script = state["generated_script"]
print(f"Generated Script:\n{generated_script}")