If you are interested in serverless architecture, you may have read many contradictory articles and wonder whether serverless architectures are cost effective or expensive. I would like to clear the air around the issue of cost effectiveness through an analysis of a web scraping solution.
The use case is fairly simple: at certain times during the day, I want to run a Python script and scrape a website. The execution of the script takes less than 15 minutes. This is an important consideration, which we will come back to later. The project can be considered a standard extract, transform, load (ETL) process without a user interface, and it can be packed into a self-contained function or a library.
Next, we need an environment to execute the script. We have at least two options to consider: running it on-premises (such as on your local machine, a Raspberry Pi server at home, a virtual machine in a data center, and so on) or deploying it to the cloud. At first glance, the former option may feel more appealing: you have the infrastructure available free of charge, so why not use it? The main concern with an on-premises hosted solution is reliability: can you ensure its availability in case of a power outage or a hardware or network failure? Additionally, does your local infrastructure support continuous integration and continuous deployment (CI/CD) tools to eliminate any manual intervention? With these two constraints in mind, I will continue the analysis with solutions in the cloud rather than on-premises.
Let’s start with the pricing of three cloud-based scenarios and go into details below.
*The AWS Lambda free usage tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month.
The first option, an instance of a virtual machine in AWS (called Amazon Elastic Compute Cloud or Amazon EC2), is the most primitive one. However, it definitely does not resemble any serverless architecture, so let’s consider it as a reference point or a baseline. This option is similar to an on-premises solution in that it gives you full control of the instance, but you would need to manually spin up an instance, install your environment, set up a scheduler to execute your script at a specific time, and keep the instance running 24×7. And don’t forget security (setting up a VPC, route tables, etc.). Additionally, you will need to monitor the health of the instance and maybe run manual updates.
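The AWS Lambda free tier figures quoted in this article (1M requests and 400,000 GB-seconds per month) can be turned into a quick back-of-the-envelope check. The following is a minimal sketch, not a pricing calculator: the function names, the 30-day month, the four runs per day, and the 512 MB memory setting are all assumptions chosen for illustration, and the 900-second duration is simply the 15-minute upper bound mentioned for the script.

```python
# Free tier limits as quoted in the article.
FREE_REQUESTS_PER_MONTH = 1_000_000
FREE_GB_SECONDS_PER_MONTH = 400_000


def monthly_usage(runs_per_day, seconds_per_run, memory_mb, days=30):
    """Return (requests, gb_seconds) for one month of scheduled runs.

    GB-seconds are the configured memory in GB multiplied by the
    total execution time in seconds. `days=30` is an assumption.
    """
    requests = runs_per_day * days
    gb_seconds = requests * seconds_per_run * (memory_mb / 1024)
    return requests, gb_seconds


def fits_free_tier(runs_per_day, seconds_per_run, memory_mb, days=30):
    """True if both the request count and GB-seconds stay within the free tier."""
    requests, gb_seconds = monthly_usage(
        runs_per_day, seconds_per_run, memory_mb, days
    )
    return (
        requests <= FREE_REQUESTS_PER_MONTH
        and gb_seconds <= FREE_GB_SECONDS_PER_MONTH
    )


if __name__ == "__main__":
    # Hypothetical workload: 4 runs per day, 15 minutes each, 512 MB of memory.
    requests, gb_seconds = monthly_usage(4, 900, 512)
    print(f"requests: {requests}, GB-seconds: {gb_seconds:,.0f}")
    print("within free tier:", fits_free_tier(4, 900, 512))
```

Under these assumed numbers the workload uses 120 requests and 54,000 GB-seconds per month, well inside the quoted limits. Changing the memory setting or run frequency shows how quickly the GB-seconds figure, rather than the request count, becomes the binding constraint for a long-running scraper.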