I work primarily on scheduling problems, which are NP-hard. As the size of the input increases, the time to find an optimal or even a feasible solution increases. My workstation is remarkably underpowered for testing larger instances of the job shop scheduling problem. So I am going to test the input data on the AWS free tier. Hopefully I can convince my bosses that powerful workstations have a good return on investment!

Before deploying anything on AWS, I googled “google or tools on aws”. The search turned up only two pages, and both were about AWS Lambda. I don’t know anything about AWS Lambda, so I am going to stick to EC2 instances, which I have used in the past.

Launching an instance on AWS EC2

OR-Tools has been tested on Ubuntu 18.04 LTS and up (64-bit), so I am going to launch an Ubuntu Server 20.04 LTS (HVM) instance.

Step 1: Choose an Amazon Machine Image (AMI)

Ubuntu Server 20.04 LTS (HVM), SSD Volume Type

Step 2: Choose an Instance Type

t2.micro is eligible for free tier
I am leaving the other steps at their defaults. One thing to keep in mind is that to SSH into an instance you need a key pair. If you have an existing key pair you can use it; otherwise you need to create a new one. The downloaded private key (.pem) file should be kept private.

Connecting to an AWS EC2

There are two options to connect to an EC2 instance from a Windows machine: PuTTY and PowerShell. I am following the tutorial on cloudlinuxtech to connect to the EC2 instance.
First, the .pem key has to be converted to a .ppk key that PuTTY understands. This is done with the PuTTY Key Generator (PuTTYgen). Then you can connect to the instance using the .ppk key. The host name to enter takes the form my-instance-user-name@my-instance-public-dns-name; for an Ubuntu AMI the default user name is ubuntu.
One mistake I made was configuring the security group so that only traffic from one specific IP address could connect to the instance. This can be changed by allowing all IP addresses in the security group settings.
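
Security group inbound rules describe their allowed sources as CIDR blocks, so the difference between "one specific IP" and "all IP addresses" comes down to a /32 rule versus 0.0.0.0/0. The sketch below only illustrates that matching logic with Python's standard ipaddress module; the function name rule_admits and the addresses (taken from the documentation-only ranges) are made up for the example and are not my actual settings.

```python
import ipaddress


def rule_admits(cidr: str, client_ip: str) -> bool:
    """Return True if client_ip falls inside the CIDR block of an inbound rule."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


# A /32 rule admits exactly one address (hypothetical example addresses).
print(rule_admits("203.0.113.7/32", "203.0.113.7"))    # True
print(rule_admits("203.0.113.7/32", "198.51.100.20"))  # False

# 0.0.0.0/0 admits every IPv4 address.
print(rule_admits("0.0.0.0/0", "198.51.100.20"))       # True
```

If your public IP changes (a different network, a VPN, a new DHCP lease), a /32 rule written for the old address stops matching, which would explain a connection that suddenly times out.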

Dr. Debugging or: How I Learned to Stop Worrying and Love the Bugs

The first error I got when trying to run the script was that the model was invalid.
ortools version - 9.2.9972
Solve status: MODEL_INVALID
After some googling I found an issue on the OR-Tools GitHub page where the model was invalid on Windows. It appears that there is some issue with the format of integers. There is a piece of code in my script where I subtract two datetime columns in pandas, and I suspect that is where Int32 is being introduced.
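
Whether a sized integer type was really the culprit is only a suspicion, but one way to rule it out is to convert the timedelta values to plain Python ints before they go anywhere near the model. Here is a minimal sketch of that conversion; the dataframe, the column names start and end, and the choice of minutes as the unit are all assumptions for illustration, not my actual data.

```python
import pandas as pd

# Hypothetical job data; the column names are assumptions for this sketch.
jobs = pd.DataFrame({
    "start": pd.to_datetime(["2022-01-03 08:00", "2022-01-03 09:30"]),
    "end": pd.to_datetime(["2022-01-03 09:15", "2022-01-03 12:00"]),
})

# Subtracting two datetime columns gives a timedelta64 column.
delta = jobs["end"] - jobs["start"]

# Convert each Timedelta to whole minutes as a built-in Python int,
# so no NumPy or pandas integer type is handed to the solver.
minutes = [int(td.total_seconds() // 60) for td in delta]

print(minutes)                     # [75, 150]
print({type(m) for m in minutes})  # {<class 'int'>}
```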

The timedelta thing was not making any sense, so I went back to the Jupyter notebook where everything worked. Funny thing: in the notebook the two dataframes were created from an Excel sheet. To run the thing on AWS I had converted them to separate CSV files. This seems to be the root cause of the error. So I checked the actual pd.read code for both versions. In the CSV version I was not parsing dates when reading the files into dataframes, whereas in the Excel version I was.
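
To make the difference concrete, here is a small sketch of reading the same CSV text with and without parse_dates. The CSV content and the column names job, start and end are invented for the example; only pd.read_csv and its parse_dates argument come from pandas itself.

```python
import io

import pandas as pd

# Made-up CSV content standing in for one of the exported files.
CSV_TEXT = """job,start,end
A,2022-01-03 08:00,2022-01-03 09:15
B,2022-01-03 09:30,2022-01-03 12:00
"""

# Without parse_dates the date columns are read as plain text.
raw = pd.read_csv(io.StringIO(CSV_TEXT))

# With parse_dates the named columns become datetime64 columns.
parsed = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["start", "end"])

is_dt = pd.api.types.is_datetime64_any_dtype
print(is_dt(raw["start"]), is_dt(parsed["start"]))  # False True

# Date arithmetic only behaves as expected on the parsed version.
durations = parsed["end"] - parsed["start"]
print(durations.iloc[0])  # 0 days 01:15:00
```

Checking df.dtypes right after loading is a cheap way to catch this kind of mismatch before the data reaches the solver.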

Performance

I checked that the code worked on the t2.micro instance. Once I was satisfied that the code worked as intended, I changed the instance type to t2.xlarge (note that t2.xlarge has 4 vCPUs; the 8-vCPU size in the T2 family is t2.2xlarge).