Folding@Home (aka FAH) is a distributed computing project. To quote from their website:
FAH is a distributed computing project for simulating protein dynamics, including the process of protein folding and the movements of proteins implicated in a variety of diseases. Folding@Home involves you donating your spare computing power by running a small client on your computer. The client then contacts the Folding@Home Work Assignment server, gets some workunits and runs them. You can choose to have it run only when your system is idle, or have it run all the time.
While I used to run FAH a long time ago, back in my forum days, I eventually stopped for lack of proper computing equipment. Recent events around COVID-19 and FAH's projects targeting it (see Coronavirus - What we're doing and COVID-19 Small Molecule Screening Simulation for details), together with the relatively powerful computer I built recently, meant that I could run FAH on my desktop computer.
Now, I had some extra AWS credits that were about to expire, and instead of letting them go to waste, I figured I could spin up some EC2 instances and run Folding@Home on them. I started by looking at the pricing of the GPU instances, but they were a bit pricier than what I could sustain. Considering this, I selected the c5n.large instance, as I didn't need instance storage, and EBS-backed disks would be handy in setting up an Auto Scaling Group.
To reduce expenses further, I looked at Spot prices, which turned out to be about 68% cheaper than the on-demand prices. Since we don't really care what happens when a spot instance is terminated, and the ASG will bring the instance count back up, I went with this option.
The spot pricing trend showed that prices had remained stable, and just to ensure the spot bids would be fulfilled, I set the max spot price a couple of cents above the highest price at the time. Initially, I brought the instances up by launching them manually from the AWS Console. I had long been meaning to use AWS CDK, and this was the perfect opportunity to learn it and try it out.
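The bid arithmetic above can be sketched in a few lines of shell. This is only an illustration: the helper names `max_spot_bid` and `spot_savings_pct` are made up for this post, the 0.02 USD default stands in for "a couple of cents", and the prices are assumed to arrive one per line on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read spot prices (USD/hour, one per line) on stdin
# and print the highest observed price plus a safety margin.
max_spot_bid() {
    margin="${1:-0.02}"   # assumed margin in USD ("a couple of cents")
    awk -v margin="$margin" '
        BEGIN { max = 0 }
        NF && ($1 + 0 > max) { max = $1 + 0 }
        END { printf "%.4f\n", max + margin }
    '
}

# Hypothetical helper: percentage saved by spot relative to on-demand.
# Arguments: on-demand price, spot price (both USD/hour).
spot_savings_pct() {
    awk -v od="$1" -v spot="$2" \
        'BEGIN { printf "%.0f\n", (od - spot) / od * 100 }'
}

# Usage sketch (needs AWS credentials, so it is left commented out):
#   aws ec2 describe-spot-price-history \
#       --instance-types c5n.large \
#       --product-descriptions "Linux/UNIX" \
#       --query 'SpotPriceHistory[].SpotPrice' --output text \
#     | tr '\t' '\n' | max_spot_bid 0.02
```

Feeding the recent price history through `max_spot_bid` gives a ceiling slightly above anything seen lately, which is the idea behind padding the bid by a couple of cents.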
The CDK code brings up a new VPC, a couple of subnets, and an ASG, and attaches a security group that allows SSH into the instances. The code is not the best: there's a bunch of hard-coded regions, AMIs, and SSH key names, but pull requests to clean it up and make it more generic are more than welcome! Check out the code on my GitHub repo.
Bring up a complete AWS Compute stack with VPC, EC2, and other dependencies using AWS CDK. Set up a Folding@Home stack with a couple of commands.
Folding on AWS
This is a CDK project which configures a multi-instance ASG. As an example, there are two sets of configs predefined: the first config creates a two-node ASG pointed to an AMI which is pre-configured to run Folding@Home, while the second config is for a single-node ASG running a base install of Ubuntu with some extras (check the packer/generic_base config for details).
The AMIs are configured and built using HashiCorp's Packer. These AMIs can then be updated in the config file.
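One way to get the new AMI ID for the config file is to parse Packer's machine-readable output. The sketch below is an assumption about how that could be done, not something the repo ships: `extract_ami_id` is a hypothetical helper, and it assumes a single-region build whose artifact line has the form `<timestamp>,<builder>,artifact,0,id,<region>:<ami-id>`.

```shell
#!/bin/sh
# Hypothetical helper: pull the AMI ID out of `packer build -machine-readable`
# output read from stdin. Assumes a single-region build, so the artifact id
# field holds exactly one "<region>:<ami-id>" pair.
extract_ami_id() {
    awk -F, '$3 == "artifact" && $4 == "0" && $5 == "id" { print $6 }' \
        | cut -d: -f2 \
        | tail -n 1
}

# Usage sketch (runs a real build, so it is left commented out):
#   packer build -machine-readable -var 'fah_user=your_username' \
#       -var 'fah_passkey=your_passkey' -var 'fah_team=your_team_id' \
#       fah_ami.json | tee build.log | extract_ami_id
```

The printed ID can then be pasted into the config file by hand.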
How to run
Preparing the AMI
- Install packer
- Generate AWS access keys
- Set the following variables in your shell's environment: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION
- Change into the packer subdirectory: cd packer
- Build the Folding at Home Amazon Machine Image that will be used to create the virtual machines:
  packer build -var 'fah_user=your_username' -var 'fah_passkey=your_passkey' \
    -var 'fah_team=your_team_id' fah_ami.json
- If you have the jq program…