Exploring Backend Web Scraping with UiPath: Extracting Hidden Data from Websites
This article explores backend web scraping with UiPath Studio in simple terms. We’ll see how UiPath can help us extract data from the hidden parts of websites. We’ll cover the basics of backend web scraping, its advantages compared to front-end scraping, and a practical example of implementing backend scraping using UiPath.
Problem Statement:
Suppose you have a product webpage (https://appsumo.com/products/caisy/) where important information, such as the launch date of a product, is not directly displayed on the page. Instead, this information is stored in the backend of the website. Your goal is to extract the launch date of the product by scraping the backend data.
Web scraping typically involves extracting information from the visible parts of a website, but in this case, the launch date is hidden in the backend, making it inaccessible through traditional scraping methods. To overcome this challenge, you need to employ backend web scraping techniques to retrieve the desired information.
Here I used the View Page Source option in the Chrome browser (Ctrl + U) to locate this data.
In this scenario, you’ll use UiPath, a Robotic Process Automation (RPA) tool, to perform backend web scraping. UiPath lets you read the underlying source of the page, where the hidden data is stored, rather than only the text rendered on screen.
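To make the problem concrete, here is a minimal Python sketch of why a value can appear in the page source yet be missing from the rendered page. The HTML fragment is hypothetical (it is not copied from the AppSumo page), and the `VisibleText` class and `visible_text` helper are names introduced only for this illustration.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: the date lives inside a <script> block,
# so it is part of the source but is never rendered as visible text.
SAMPLE_HTML = """
<html><body>
<h1>Example product</h1>
<script type="application/ld+json">
{"name": "Example product", "uploadDate": "2023-01-15T08:00:00Z"}
</script>
</body></html>
"""


class VisibleText(HTMLParser):
    """Collects text a browser would render, skipping script/style blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    return " ".join(parser.chunks)


print("uploadDate" in visible_text(SAMPLE_HTML))  # False: not in rendered text
print("uploadDate" in SAMPLE_HTML)                # True: present in the source
```

A scraper that only reads on-screen text would miss the date in this fragment, which is why the steps that follow work on the element's HTML instead.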
Solution:
Step 1:
- Open the browser by using the Open Browser activity.
- Enter the URL of the webpage in this activity.
Step 2:
Drag and drop the Get Attribute activity into the body of the Open Browser activity.
- Indicate element inside browser: click this option, then select the required element on the webpage.
- Attribute: the attribute whose value you want to read. In this problem we use the “inner HTML” attribute.
- Save to: the variable that stores the retrieved data. In this problem we use a variable named “text”.
Step 3:
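Drag and drop the Find Matching Patterns activity into the same body.
Click Configure Regular Expression….
In this configuration window, enter a valid regular expression and the text you want to test the pattern against. These values are then entered in the activity's properties.
Properties:
- Pattern: a valid regular expression that finds the required text. For example, to find the date, use the expression "uploadDate": "([^"]+)T (with straight quotes), which captures everything between the opening quote of the value and the T that starts the time portion.
- Text to search in: the text the pattern is applied to, here the “text” variable. Any matches found are stored in the result.
- Result: the variable that stores the matches, if any are found. This variable has an IEnumerable data type (a collection of matches), so we can access the data by using a For Each loop.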
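The behaviour of the pattern can be checked outside UiPath. The following is a minimal Python sketch, not the UiPath activity itself: the `page_source` string is a hypothetical stand-in for the text returned by Get Attribute, and `find_upload_dates` is a helper name introduced only for this example.

```python
import re

# Hypothetical snippet of the text stored in the "text" variable.
page_source = '{"name": "Example product", "uploadDate": "2023-01-15T08:00:00Z"}'

# Same idea as the Pattern property: capture what sits between the
# opening quote of the value and the "T" that starts the time part.
UPLOAD_DATE = re.compile(r'"uploadDate": "([^"]+)T')


def find_upload_dates(text):
    """Return every captured date string found in text (empty list if none)."""
    return UPLOAD_DATE.findall(text)


print(find_upload_dates(page_source))  # ['2023-01-15']
```

Because `[^"]+` cannot cross a double quote, the capture stays inside the value. The pattern also expects exactly one space after the colon, so it would need adjusting if the source is formatted differently.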
Step 4:
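In this step, we access the data by using a For Each loop over the result variable and store each value by using an Assign activity. For example, assuming the loop variable is named item and its type is set to Match, item.Groups(1).Value returns only the captured date.
This gives you access to the hidden data. To display it, add a Message Box activity, which shows the extracted value.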
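The loop in this step can be sketched in Python as follows. This is an illustrative analogue, not UiPath code: `re.finditer` stands in for the collection of matches, `match.group(1)` plays the role of the Assign activity, and `print` stands in for the Message Box. The sample text and the `collect_dates` helper are hypothetical.

```python
import re

# Hypothetical source text containing two uploadDate entries.
page_source = (
    '{"uploadDate": "2023-01-15T08:00:00Z"} '
    '{"uploadDate": "2024-03-02T10:30:00Z"}'
)


def collect_dates(text):
    """Walk over every match and keep only the captured date."""
    dates = []
    # finditer yields one match object per hit, much like the
    # collection of matches that the For Each loop iterates over.
    for match in re.finditer(r'"uploadDate": "([^"]+)T', text):
        launch_date = match.group(1)  # the "Assign" step: first capture group
        dates.append(launch_date)
    return dates


for date in collect_dates(page_source):
    print(date)  # stands in for the Message Box
```

Reading the first capture group rather than the whole match drops the surrounding "uploadDate" label and the trailing T, leaving only the date.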