Automating web-based tasks with Selenium, and doing it efficiently: that's the name of the game here. So take the reins and make technology work for you.
A coding story in three chapters (with a bonus).
About the Project
There I was, sitting on a chair, having to gather all these newspaper translations, from who knows where, in who knows what languages. Let me tell you, it was a real slog. The c h a r a c t e r l i m i t just wouldn't let me get the full show. Nothing but snippets, snippets, snippets...
Then, the idea struck. This script, see, it's gonna let you bypass all that. No more copying and pasting. Just grab a coffee, make it strong and black, and wait for the whole enchilada - right there in front of you. A real time-saver, I'm telling you.
The best part? Turns out, it wasn't just about solving an immediate problem: it was about deepening my know-how, getting my hands dirty with some real-world programming. A win-win, if you ask me.
Installation and Configuration
Introduction
Alright, let's crack open this code. Step by step, we're gonna peel back the layers of that machine, where each piece fits in like a puzzle. By the time we're done, you're gonna have a whole new appreciation for Selenium automation and what it can do.
And mind you, you're not just gonna be learning for learning's sake. Nah, this is about equipping you with the tools to tackle your own challenges. Whatever's got you tied up in knots, this is your chance to untangle it and make it sing.
Installation
"Time to get this thing installed and ready to rock."
First things first - you're gonna need to clone or download the source code. The repository is online here. Don't worry, I'll wait.
If you don't already have Firefox on your system, you're gonna need to get that taken care of. For Ubuntu users and other Debianistas out there, it's a simple enough fix:
sudo apt update && sudo apt install firefox
It's also possible to use Chrome instead of Firefox. That isn't covered here, so if you really want to go that way, head this way.
Next, we've got some packages to install. Nothing too crazy, but we gotta make sure we've got all the tools ready to roll. So fire up that trusty old pip and run:
pip install -r requirements.txt
Boom. Done and done.
Now, the real kicker - the GeckoDriver. This is the secret sauce that sits behind the scenes, translating your Selenium commands into a language Firefox can understand.
You're gonna want to head over to the Mozilla repository, grab the latest version, and get it all set up. I'm talking extraction, permissions, symbolic links - the whole nine yards. And you know the magic? There's a bash spell ready for that (it pins v0.35.0, so swap in a newer release number if there is one):
wget https://github.com/mozilla/geckodriver/releases/download/v0.35.0/geckodriver-v0.35.0-linux64.tar.gz -O /tmp/geckodriver.tar.gz \
&& sudo tar -C /opt -xzf /tmp/geckodriver.tar.gz \
&& sudo chmod 755 /opt/geckodriver \
&& sudo ln -fs /opt/geckodriver /usr/bin/geckodriver \
&& sudo ln -fs /opt/geckodriver /usr/local/bin/geckodriver
That's a mouthful, I know. But trust me, it's worth it. Once you've got all that squared away, you're gonna be almost ready to start automating like never before.
Configuration
Alright, folks, time to dive into the nitty-gritty of this configuration stuff. Because let me tell you, if you don't get this part right, the whole thing's gonna be about as useful as.. a lawn mower in the middle of the Mojave.
First up, we've got those FIREFOX_PATH and GECKODRIVER_PATH variables. Now, I know what you're thinking - "But how in the world am I supposed to know where these things are hiding on my system?" Well, fellas, depending on the one you're looking for, there's a little bash command you can run to ferret that out:
which firefox
About the Gecko.. Remember that installation process we went through earlier? Well, the path you set up there is the one you'll want to use here. Not sure about it? No problem, same sauce. Just run:
which geckodriver
And Zap! Paste that info into the right place. Easy, right?
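If you'd rather let Python do the ferreting, here's a minimal sketch of the same lookup using the standard library. The `find_binary` helper is a made-up name for illustration, and whether the script itself tolerates a missing path is an assumption, so treat this as a convenience and not as part of the original code.

```python
import shutil


def find_binary(name):
    """Return the absolute path of `name` found on PATH, or None if absent."""
    return shutil.which(name)


# Same lookups as `which firefox` and `which geckodriver`.
# find_binary is a hypothetical helper, not part of the original script.
FIREFOX_PATH = find_binary("firefox")
GECKODRIVER_PATH = find_binary("geckodriver")

for label, path in (
    ("FIREFOX_PATH", FIREFOX_PATH),
    ("GECKODRIVER_PATH", GECKODRIVER_PATH),
):
    print(f"{label} = {path if path else 'not found on PATH'}")
```

If either line prints "not found on PATH", go back to the installation steps before touching the configuration.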
Now, the HEADLESS parameter - this one's a bit of a wild card. See, you can run this whole thing in headless mode, which means the browser will run and do its job but won't actually show up on your screen. Kinda like a ninja, you know? But if you want to see what's going on, and watch your Firefox act like it's possessed by a spirit, you can set it to False and watch the magic unfold - try and see.
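For the curious, here's roughly how a headless switch can be wired into Selenium 4. This is a sketch under assumptions, not the script's actual code: `firefox_arguments` and `build_driver` are hypothetical names, and the real script may set things up differently.

```python
def firefox_arguments(headless):
    """Return the extra Firefox command-line arguments for the chosen mode."""
    # "-headless" is Firefox's own flag for running without a window.
    return ["-headless"] if headless else []


def build_driver(firefox_path, geckodriver_path, headless=True):
    """Start Firefox through GeckoDriver (needs the selenium package).

    Hypothetical helper: the original script may build its driver differently.
    """
    # Imported here so firefox_arguments stays usable without Selenium.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service

    options = Options()
    options.binary_location = firefox_path  # value of FIREFOX_PATH
    for argument in firefox_arguments(headless):
        options.add_argument(argument)
    service = Service(executable_path=geckodriver_path)  # GECKODRIVER_PATH
    return webdriver.Firefox(service=service, options=options)
```

Calling `build_driver(FIREFOX_PATH, GECKODRIVER_PATH, headless=False)` would pop up a visible Firefox window; remember to call `.quit()` on the driver when you're done.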
As for SOURCE_LANG and OUTPUT_LANG, well, pretty self-explanatory. Just pick the language you want to translate from (source) and the one you want to translate to (output), and the script will handle the rest. If you're looking to learn a little more about the available languages, take a glance at that readme file in the repository here.
Last but not least, we've got CHAR_LIMIT, TIMEOUT, and SLEEP_TIME. These are all about fine-tuning the performance. TIMEOUT and SLEEP_TIME depend mostly on the speed of your network connection: they set how long we oughta wait for the website to be fully loaded before proceeding with the translation. CHAR_LIMIT is tied to DeepL's restrictions (currently 1500 characters). Tinker with them as needed, but be careful - you don't want to break the whole machine.
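To make the character limit a bit more concrete, here's one way a long text could be cut into pieces that each fit under it. It's a sketch, not the script's actual splitting logic: `split_text` is a hypothetical name, and this version breaks on whitespace and collapses line breaks, which the real script may not do.

```python
def split_text(text, char_limit=1500):
    """Split text into chunks of at most char_limit characters.

    Breaks on whitespace where possible. Runs of whitespace (including
    newlines) are collapsed into single spaces by this simple approach.
    """
    chunks = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be cut hard.
        while len(word) > char_limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:char_limit])
            word = word[char_limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= char_limit:
            current = candidate
        else:
            # The chunk is full: store it and start a new one with this word.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


for piece in split_text("aaa bbb ccc", char_limit=7):
    print(piece)
```

Each chunk can then be sent to the translator one at a time and the results stitched back together in order.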
The INPUT_FILE and OUTPUT_FILE - those are the ones you'll want to point to your own input and output files. Simple enough, right? Just make sure you're pointing the input to something that actually contains text, because that's what this script is built to handle.
Now, a quick word of warning - the script is not perfect. If the output file doesn't exist yet, the script's gonna go ahead and create it for you. But if it does exist, well, buckle up, because the script's gonna overwrite it every time you run it. So, you know, maybe keep an eye on that (and specify a different output each time) if you don't want to see your previous achievements disappear..
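If renaming the output by hand every run sounds tedious, a small helper can pick a fresh name for you. This is only a sketch of one possible workaround: `unique_output_path` is a hypothetical function, not something the script ships with, and the timestamp format is an arbitrary choice.

```python
from datetime import datetime
from pathlib import Path


def unique_output_path(output_file, now=None):
    """Return output_file if it is free, else a timestamped sibling path.

    Hypothetical helper: the original script simply overwrites OUTPUT_FILE.
    """
    path = Path(output_file)
    if not path.exists():
        return path
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    # Example: "output.txt" becomes "output-20240102-030405.txt".
    return path.with_name(f"{path.stem}-{stamp}{path.suffix}")


print(unique_output_path("output.txt"))
```

Feed the returned path to OUTPUT_FILE before each run and the earlier translations stay where they are.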
Wait.. I almost forgot.. You're a WSL user? WSL now supports running Linux GUI applications in a fully integrated experience. On older configurations, you might need this if you intend to run Firefox as a marionette with its GUI showing.
Usage
Once you've got those parameters all squared away, fire up the terminal, go to the script's folder and just run:
python translate.py
Ka-boom, the show's on the road. The script's gonna take that input file, work its translation-y wonders, and spit out the results into your output file. Fear not, my friend: just keep an eye on that output file, and you'll be able to see the fruits of that labor, plain as day.
What are you waiting for? Get out there, update those parameters, and let's see what kind of translation magic you can work!
If you're curious about the code, you can get a quick overview in the second chapter or just dive deep into it (third chapter)!
The code is available on GitHub.
(Cover picture: Double Indemnity, 1944).