I figured out how to use Google Colab over the weekend and wanted to share what I learned. What is Colab? Colab is essentially Jupyter Notebook meets Google Cloud, plus free GPUs for running ML workloads. It also integrates with Google Drive and GitHub. Why is this great? Besides getting to sound hip as you tell people you program on the cloud, it means you can access and run your code anywhere, on any device.
If you’ve never used Python before, this blog might be a bit much; Jupyter (and Colab) is a Python environment. In the future, I hope to write up a good list of resources for self-learning Python, as I get asked about this often. If you’ve coded Python but haven’t used Jupyter Notebook before, I’ll cover the basics.
- Always run it in Chrome! Colab seems to be built with Chrome in mind. Almost all the issues I was having resolved themselves after I jumped off Safari.
- If the code seems unresponsive, click the RAM/Disk indicator and then click ‘connect to hosted runtime’
- Shift + Enter to run cells
You have two boxes – text and code.
If you want to make titles, sections and write explanations for stuff, then use text. Too big for a comment? Text. Want different font sizes? Text. The text cells use Markdown, which is easy to learn. Type like normal and use the hash symbol followed by a space for titles to get your point across (e.g. ‘# Title’). When you make titles, Colab puts them in the sidebar for easy navigation. There’s also the ability to write complex maths equations using LaTeX. To learn more, here’s a guide.
Code boxes are for Python. You can run the cells in isolation and in any order you want. Variables stay in memory for the whole notebook session, so you don’t have to rerun everything when you redo a section. There’s no hand-holding, so it pays to be careful about overwriting variables.
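To make the overwriting risk concrete, here is a minimal sketch. The comments mark where separate cells would begin, and the variable names and numbers are made up for illustration.

```python
# Cell 1: load some data (hypothetical readings)
readings = [3, 1, 4, 1, 5]

# Cell 2: overwrites the very variable it reads from
readings = [r * 2 for r in readings]

# Running Cell 2 a second time (easy to do by accident) doubles it again
readings = [r * 2 for r in readings]
# readings now holds four times the original values

# Safer: keep the raw data under its own name and derive new names from it
raw_readings = [3, 1, 4, 1, 5]
doubled = [r * 2 for r in raw_readings]
doubled = [r * 2 for r in raw_readings]  # rerunning gives the same result
```

If a notebook ends up in a confusing state, restarting the runtime and running the cells from top to bottom clears everything out.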
When I started using Jupyter, I remember being surprised by how easy it is to draw graphs and visualise data. It also displays the value of the last expression in a cell without needing print(). The Getting Started notebook provides a good example of these features.
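As a small sketch of the no-print() behaviour, the cell below ends with a bare variable name. In a notebook that value is shown under the cell; in a plain Python script the last line does nothing. The scores are made-up numbers.

```python
import statistics

scores = [72, 88, 95, 61]  # hypothetical data
summary = {
    'mean': statistics.mean(scores),
    'max': max(scores),
}

# Last line of the cell: a notebook displays this value without print()
summary
```

Only the last expression gets this treatment, so anything you want to see earlier in the cell still needs print().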
Coding on the cloud is no fun and no games if you can’t read and write data. Thankfully there are several options for input/output, from ‘mounting’ your Google Drive to going through an API. For general purposes, I found mounting to be the best approach.
In most file systems, you can’t have two files with the same name in the same folder. This is not the case with Google Drive. In Google Drive, everything gets assigned a unique ID which takes precedence over everything. Also, the same file can sit in multiple folders. Mounting, however, treats it like a standard file system, so you never need to worry about this if you stick to mounting.
There are two methods for mounting GDrive: code or the interface. While I love coding, the interface makes more sense because it’s easier. The code for mounting won’t work outside of Colab anyway. So unless you take particular pleasure in writing platform-dependent code that isn’t needed, read on.
Firstly, you need to be connected to a runtime. If this doesn’t start automatically, you can run something or click ‘connect’ in the top right. After you’re connected to the runtime, you can mount the GDrive.
Click the folder icon on the left side panel and hit ‘Mount Drive’, follow the prompts and you’re done! If nothing happens, hit refresh. Currently, you’re in the ‘content’ folder, so if you click on the ‘..’ folder you can still find your Drive (see below).
Now it’s available for reference in Python. I recommend clicking the three dots and using ‘copy path’ until you get confident with the full directory. The reason is that anything that sits in your drive will need the prefix ‘/content/drive/My Drive/’. Lastly, if you want a code block for outputting to your drive, click the ‘<>’ on the left sidebar and search for ‘write’.
This also conveniently shows you the other method for mounting your Google Drive. Replacing f.write(…) with var = f.read() for reading and using ‘copy path’ gives you easy I/O on the cloud!
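Put together, the write-then-read pattern looks roughly like the sketch below. It is not the exact snippet Colab gives you. The file name is hypothetical, the fallback to a temporary folder is only there so the code also runs outside Colab, and the second folder name (‘MyDrive’) is an assumption about how some runtimes label the mount, so check yours with ‘copy path’.

```python
import os
import tempfile

try:
    # google.colab is only importable inside a Colab runtime
    from google.colab import drive
    drive.mount('/content/drive')
    # Prefix from this post first; 'MyDrive' is an assumed alternative name.
    candidates = ['/content/drive/My Drive', '/content/drive/MyDrive']
    base_dir = next(p for p in candidates if os.path.isdir(p))
except ImportError:
    # Outside Colab: use a temporary folder so the sketch still runs
    base_dir = tempfile.mkdtemp()

path = os.path.join(base_dir, 'colab_demo.txt')  # hypothetical file name

# Write a file to the (mounted) folder
with open(path, 'w') as f:
    f.write('Hello Google Drive!')

# Read it back, swapping f.write(...) for var = f.read()
with open(path) as f:
    var = f.read()

print(var)
```

The try/except is exactly the kind of platform-dependent code the interface route lets you avoid, which is why I still prefer clicking ‘Mount Drive’.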
Colab also works directly with GitHub. You can save notebooks to one of your repositories and also load any notebook on GitHub directly into Colab (without having to download, copy, etc.). There’s even a Chrome extension by Google that lets you do this in a single click! You can read more about this here.
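As far as I can tell, loading from GitHub comes down to a URL swap: a public notebook at https://github.com/... opens in Colab at https://colab.research.google.com/github/... with the rest of the path unchanged. Here is a minimal sketch of that rewrite. The helper name and the example repository are hypothetical.

```python
GITHUB_PREFIX = 'https://github.com/'
COLAB_PREFIX = 'https://colab.research.google.com/github/'


def colab_url(github_url):
    """Turn a GitHub notebook URL into the matching Colab URL."""
    if not github_url.startswith(GITHUB_PREFIX):
        raise ValueError('expected a https://github.com/ URL')
    if not github_url.endswith('.ipynb'):
        raise ValueError('expected a link to an .ipynb notebook')
    # Keep everything after the GitHub prefix (user/repo/blob/branch/path)
    return COLAB_PREFIX + github_url[len(GITHUB_PREFIX):]


# Hypothetical repository and notebook, for illustration only
example = 'https://github.com/some-user/some-repo/blob/main/demo.ipynb'
print(colab_url(example))
```

Paste the printed link into your browser and the notebook should open in Colab, ready to run.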
Bonus – essential settings
- Switch to dark mode. If you’re not coding in dark mode, you’re not coding :p
- Switch to Vim and learn Vim. Hint: Press ‘i’ to insert and ‘a’ to append and start editing like normal. After that, it’s easy to learn at your own pace. Don’t buy into the lies that Vim is hard; just have a cheat sheet handy and don’t stress it!
- For a bit of fun, set the Power Level to Many Power and enable Corgi Mode and Kitty Mode (spoilers in the image below). Although intended to be a joke, it actually caused my browser to use a lot of power.
I had fun learning this over the weekend and I’ll be using it as my primary Python environment for Data Science. If you have any tips or tricks for Colab, please let me know!
As always, thanks for taking the time to read my blog! If you have any comments, suggestions or want to chat, feel free to connect with me on my LinkedIn!
~ Ryan Edwards