DISCLAIMER: Before you begin, be aware that the protagonist of any fictional account that this write-up might contain would be male and would be referred to as ‘he’ or, at times, simply as ‘coder’. That’s not because the author is a sexist prick (hopefully) or a lazy writer; it’s because the writer is male, is identified by simpler minds as a ‘coder’ and would like to write the post without having to worry about being PC.
There are two things that might be useful to know before we really dive in: who’s the nut-job who wrote this and why did he do it?
You can skip this part entirely if you want to. I won’t hate you for that. Or maybe I will. It doesn’t matter anyway, right?
In the context of this post, he is someone who likes to code and sporadically write a blog post or two. He believes that what he writes, code or letter, can make an impact. Who doesn’t?
Why he wrote this is more important. The immediate reason is that he recently contributed to an amazing library, felt pretty darn good about it and would like his peers to do the same more often.
A deeper reason still is his belief in the philosophy of Free/Libre and Open Source Software (FLOSS). There are different camps within this movement. You are free to either do your research and align yourself with one of them, or stay neutral. The deeper philosophical debate isn’t in this post’s scope.
Below are quotes by two dudes who are faces of this movement. They might not be on the same page at times but have done more for the world than most will ever know:
Linus likes to fuck around every now and then. No pun intended.
“Every great Open Source contribution consists of three parts or acts.
The first part is called “The Itch”. Perhaps there’s a bug which cropped up when the coder was working with a library, or maybe there’s some feature which the world needs but hasn’t been implemented yet. Better still, he just wants to ‘contribute’.
The second act is called “The Hustle”. The coder pulls his socks up, puts on his hackathon t-shirt and dives into the source code and documentation. Now, he’s looking for the solution… but he won’t find it, not soon anyway, because of course he’s overwhelmed. Eventually he will write the required code, update the documentation and send a PR. The world will see his contribution now. Well, not yet. Because writing the code in your branch isn’t enough.
That’s why every contribution has a third act, the critical part, the part we call “The Merge”.”
Reasons why the itch might crop up have been detailed in the previous section. It might appear after a week or month-long buildup. Then again, it can also hit you out of nowhere one sunny afternoon as you are crouched in your air-conditioned office trying to implement a shiny new feature for your company’s ‘world-changing’ product or at midnight when you are working on a college assignment due next morning.
It can hit you because you want to make an impact, or because someone told you that Open Source can help you land your dream job. Simpler reasons could be a wish to attend conferences or a desire to get your GitHub tiles to turn green.
Whether you are a noble technologist or a simple prick, commits and pushes don’t judge. Pull requests might.
If you know the library you want to work with and which bug you want to fix or which feature you need, that’s great! Get the library’s source code and documentation on your local system. Maybe open an issue, file a bug or feature request while you’re at it.
If you’re one of those who just want to ‘contribute’ but don’t know where to start, pat yourself on the back for taking the first step. In a nutshell, you will need to zero in on the source code (library/package) and what you want to do with/in it. Maybe you want to help the guys at Mozilla or maybe you want to contribute to the popular machine learning library by Google.
These links will help you get started with scratching the itch:
– How to get started with OpenSource | HackerEarth
– How do I start contributing in Open Source projects? | Quora
– How to start contributing to Open Source | Developer.com
As for the author’s most recent brush with Open Source, he wanted to contribute to Pandas, an amazing Python library for data analysis that he had been using for some time.
He went to the Pandas GitHub repository, filtered open issues with the tag ‘difficulty novice’ and found a bug which seemed interesting.
You are done searching and deciding the higher-level details. You know which project to contribute to and which bug to solve or feature to implement.
Buckle up because you’re just getting started. More often than not, when working with an Open Source project, a large code-base or tons of documentation won’t be your first hurdle. Setting up the project, building it and running the tests can be a real pain though. Being on a UNIX-like system (Linux or macOS) and being adept with the command line will come in extremely handy when doing this. Sorry Windows users, that’s just the way things are!
The most important tool you will ever use when developing Open Source software is git, a tool for version control. Version control enables multiple developers, hundreds or thousands at times, to work together on the same code-base. If you haven’t already, learn git.
Since tests have already been mentioned, it’s essential that you know this: they are important! In fact, tests are one of the most important components of any decently-sized software project. Think of it this way. In the absence of reliable tests, we could end up breaking old features whenever we make changes. The developers would keep fixing bugs which could have been easily avoided and would eventually give up altogether. Many popular Open Source projects use a test-driven development (TDD) style and Continuous Integration (CI). Read up about CI and testing frameworks for your language of choice.
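To make the point about tests concrete, here is a minimal sketch of what a small function and its tests might look like. The function `mean_ignoring_missing` and the test names are made up for illustration; they are not taken from any real project. The tests are written as plain functions with bare `assert` statements, which is the style a runner such as pytest picks up.

```python
def mean_ignoring_missing(values):
    """Return the mean of the values, skipping None entries.

    Hypothetical helper used only to illustrate testing.
    Returns None when there is nothing to average.
    """
    present = [v for v in values if v is not None]
    if not present:
        # Without this guard the division below would raise ZeroDivisionError.
        return None
    return sum(present) / len(present)


def test_plain_values():
    assert mean_ignoring_missing([1, 2, 3]) == 2


def test_missing_values_are_skipped():
    assert mean_ignoring_missing([1, None, 3]) == 2


def test_all_missing_returns_none():
    # A regression test: it pins down an edge case so that a later
    # change cannot silently break it again.
    assert mean_ignoring_missing([None, None]) is None
```

If someone later removes the guard for empty input, the last test fails right away, and that failure shows up on CI before the change gets anywhere near a release.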
The drudge work has been done and the project has been set up on your machine. Ensure that you go through the project’s developer documentation (example). Once that’s done, we are all set to start writing code!
The coding can take time but it’s simple in principle. Write tests, if needed, and the required code.
At this point, the code has been written. It’s time to open a Pull Request and start working towards the third act.
After identifying the bug, he forked the Pandas repo and cloned it to his local machine. He went through the ‘Contributing to pandas’ docs which detailed how to work with the code-base, installed dependencies, created a new branch, wrote the required tests and code, pushed to his fork on GitHub, opened a Pull Request and waited for a maintainer to review his PR.
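The git side of that workflow can be sketched roughly as below. This is not the exact set of commands the author ran; the fork URL, branch name and commit message are placeholders, and the helper functions are made up for illustration. You would normally type these commands in a terminal. They are wrapped in Python here only so that the sequence can be read and checked in one place.

```python
import subprocess


def repo_dir_from_url(fork_url):
    """Guess the directory that `git clone` creates for the given URL."""
    name = fork_url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def contribution_commands(fork_url, branch, commit_message):
    """Return the git commands to run before and after editing the code."""
    repo = repo_dir_from_url(fork_url)
    before_editing = [
        ["git", "clone", fork_url],
        # Work on a dedicated branch rather than on the default one.
        ["git", "-C", repo, "checkout", "-b", branch],
    ]
    after_editing = [
        ["git", "-C", repo, "add", "--all"],
        ["git", "-C", repo, "commit", "-m", commit_message],
        # Push the branch to your fork; the PR is then opened on GitHub.
        ["git", "-C", repo, "push", "--set-upstream", "origin", branch],
    ]
    return before_editing, after_editing


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Placeholder values; replace them with your own fork and branch.
before, after = contribution_commands(
    "https://github.com/your-username/pandas.git",
    "fix-novice-bug",
    "Fix the bug and add tests",
)
for argv in before + after:
    print(" ".join(argv))
```

The snippet only prints the commands. Calling `run_all(before)` would actually clone the fork and create the branch, and `run_all(after)` would commit and push once the tests and code have been written.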
At times, this will be a smooth ride. You will have done everything correctly and your PR will be merged without much fuss. Many times, that won’t be the case, especially when you are starting out.
If the project uses CI, you will have to make sure that the required tests and checks pass. If your changes aren’t up to the mark, the reviewers will tell you what to fix. This might happen multiple times, but your hard work will pay off: once the requested changes have been made, your PR will get approved.
This will be followed by the moment which you had been looking forward to all this while. Your contribution will get merged into the project. Hooray!
His PR had to be changed multiple times after it was opened. Getting all the checks to pass took a couple of iterations of the process described previously. Then the maintainer asked him to add a few more tests just to be safe, since a tricky case was involved. A week after the PR was opened, it got approved, the changes got merged and the author slept peacefully that night.
You might feel overwhelmed at this point and that’s all right. It’s a lot of information to process and Open Source might seem like a lot of work. But this is how great software gets built. Also, this is a great way to make an impact through your code and meet awesome people in the process.
Hang on and you will have the time of your life, mate.