So any British expat / foreigner, like me, that lives in Spain knows how incredibly difficult it is to book an appointment and get your paperwork in order.
There’s always something.
You get your ID number, then you have to get your social security number, then you have to register where you live (padron) using these numbers and a rental contract, but that registration only lasts a few months before you have to renew it.
NOW because of Brexit, I had to go and get my fingerprints done and register as a long term resident before my European status gets nullified!
Failure to acquire this ID card update will have some pretty annoying implications with my ability to continue working in this country and build a life here!
Enter the infamous Aluche office, where expats spend hours / days / weeks trying to find an appointment, the office that’s way out of town and takes ages to get to, the office that sends you home crying unless you have 17 photocopies of each document.
I’m also not joking about old housemates coming home crying after the Aluche trip.
If you have to do any paperwork in Spain, plan to book half a day to a full day off work.
I was lucky enough to waste a complete afternoon in summer looking for appointments to get my fingerprints done. I even took my native girlfriend with me as backup for the problems they were bound to give me (as always)... Done, all done.
“Come back 1 month from now and collect your new ID card”
Perfect, I couldn’t believe how flawlessly easy it was to get this done. We were both surprised! All I had to do was swing by in a month and pick up my shiny new ID.
So obviously my brain only has enough space to hold vitally important information and anything else just falls out unless there are 3 alarms set to remind me.
My girlfriend asks:
“When was your appointment to pick up your card...?”
We both looked at each other knowing that I had royally messed up.
I was 1 month late to pick up my ID and the cut-off date for British expats was in 1 month.
I called the infamous office to see if I could just swing by and pick up my card.
“NO! You book an appointment. If there are no appointments, you try again the next day”
So, as I mentioned earlier, I was facing “hours / days / weeks trying to find an appointment” with my European status on the brink of death. Not a good place to be.
The long and short of it: I had just started a new job, so I wasn’t in a position to waste hours refreshing the government page, hoping I could snatch someone’s cancelled appointment.
So I built a small script that automated the appointment hunting for me. If anything became available, my Mac would start shouting at me from the other room, and I could run over, check the available appointments and book one.
I was able to comfortably get on with work, knowing that at some point my Mac would start shouting in its beautiful robotic voice and everything would be ok.
For this I decided to use Selenium, a browser automation tool with a Python package, commonly used to run automated tests on websites.
You can set it up to open pages, click around, fill in forms, wait around and check if certain things appear on the page or not.
I had just spent 13 hours on a code test doing just that, so I wasn’t going to pass up this opportunity.
I learned a few things about using Selenium effectively. One was importing Expected Conditions (usually imported as EC), which you pair with an explicit wait so the script automatically waits until a specific element appears on the page.
That way, if the website in question is slow to load, your program won’t crash just because it can’t find the button it’s looking for yet.
You also don’t need to hard-code wait times (3 seconds, for example) while the page loads.
I also pulled in a module for deliberate pauses, because I didn’t want to rush through the site like a bot would:
“let’s fake being a human and wait for 4 seconds right here”
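To make the difference between the two kinds of waiting concrete, here is a rough sketch in plain Python. It is not Selenium's implementation, only the general idea behind an explicit wait: keep polling for a condition until it holds or a timeout passes. The names `wait_until` and `human_pause` are illustrative and do not come from Selenium.

```python
import time


def wait_until(condition, timeout=10.0, poll_every=0.5):
    """Call condition() repeatedly until it returns something truthy.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_every)


def human_pause(seconds=4):
    """A deliberate fixed pause, used only to look less like a bot."""
    time.sleep(seconds)
```

The first function returns as soon as the thing you are waiting for shows up, so a fast page costs you nothing. The second always burns the full time, which is exactly what you want when the goal is to slow down on purpose.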
I wrote all of the automation code beforehand and then stuck it in a while loop.
If there are no appointments, “lets_go” will always be True; otherwise, if we do find an appointment, we’ll cut the program and change “lets_go” to False.
We .get() the website in question and get this party started.
It’s crude, but I was in a race against time.
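The shape of that loop can be sketched roughly as follows. This is a reconstruction under assumptions, not the original script: `check_once` is a hypothetical function standing in for all of the automation code (open the site, click through the form, report whether an appointment turned up), and `wait_seconds` is whatever delay you choose between attempts.

```python
import time


def hunt(check_once, wait_seconds, sleep=time.sleep):
    """Keep running check_once() until it reports an appointment.

    check_once is a hypothetical callable that walks through the site once
    and returns True if an appointment was found, False otherwise.
    Returns the number of attempts it took.
    """
    lets_go = True
    attempts = 0
    while lets_go:
        attempts += 1
        if check_once():
            lets_go = False      # found one: stop looping
        else:
            sleep(wait_seconds)  # nothing yet: wait, then go round again
    return attempts
```

Passing `sleep` in as a parameter is only there so the loop can be tried out without really waiting.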
Here’s an example of clicking around a website.
In most cases, you can use the page inspector to find the element you want to click on, then copy its XPath.
Pair this XPath with Expected Conditions (EC) and a maximum wait of 10 seconds for the element’s presence to be located, and we have told Python to wait until it can find the button in question.
Once it’s found, save it to a variable and then we can do bad things to it.
.click() is one of those bad things.
.send_keys() is another (this inputs text into a form field).
You’ll notice that I’m pretending to be human by waiting 2 seconds after filling in a form field. It’s easier to pretend to be human in the real world; in Python it just feels forced.
So when we get all the way through the form and realise there are no appointments, we are told that there will be more at some point in the future, but who knows when.
That’s when we start the process over and hope that in the last 5 minutes we’ve wasted, something has become available – so we try over and over.
You know, because... Brexit in 25 days.
Now for a crude, improperly written test.
At this point when most people would be in a panic and feel deflated, I just ask Python to rinse and repeat.
So the typical scenario is that we get a message (in Spanish, obviously) every time there isn’t an appointment. If that happens:
- Close the browser
- Wait for 2100 seconds (35 mins), then try again and again and again and again.
Why did I choose 35 mins? To avoid repeatedly hitting the site and possibly triggering some kind of bot blocker, and more generally to be a responsible internet citizen by not overloading other people’s servers with automated requests.
To be honest, real humans probably hit that site more often than every 35 mins to secure an appointment and I probably would too – manually, but yea – responsible automation.
However, if that “no appointments” message is not displayed on the page, it means there ARE appointments. In that case:
- Tell the Mac to repeatedly shout at max volume “Appointments Available”
- Change lets_go to False and exit the program.
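The two branches above could look something like this. It is a sketch under assumptions: `no_slots_message` is a placeholder for whatever text the site shows when nothing is available, the function names are made up for illustration, and the shouting relies on the macOS `say` command. Closing the browser (`driver.quit()` in Selenium) and setting the volume are left out.

```python
import subprocess
import time


def no_appointments(page_text, no_slots_message):
    """Crude test: is the 'nothing available' message on the page?"""
    return no_slots_message in page_text


def shout(message, times=5, run=subprocess.run):
    """Speak `message` aloud several times using the macOS `say` command."""
    for _ in range(times):
        run(["say", message], check=False)


def handle_result(page_text, no_slots_message, wait_seconds=2100,
                  sleep=time.sleep, announce=shout):
    """Return the new value of lets_go after one pass through the site."""
    if no_appointments(page_text, no_slots_message):
        sleep(wait_seconds)  # 2100 seconds = 35 minutes before the next try
        return True          # keep looping
    announce("Appointments Available")
    return False             # stop looping
```

Checking for the absence of a message is fragile: an error page or a changed wording would also count as "appointments available". For a script whose only job is to make a human come and look, that kind of false alarm is an acceptable trade.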
In total, this program was started at 7am and ran through the site every 35 mins until 3pm, continuously looking for open appointments.
I would have loved to automatically book a slot at a time that suited me, but without access to the booking pages, I wasn’t able to find the “codes” to target and automate, so this was a best-case scenario.