Right from my undergrad days when I was starting out with machine learning to this date, my admiration for Kaggle continues to grow. In addition to being synonymous with and popularizing data science competitions, the platform has served as a launching pad and breeding ground for countless data science and machine learning practitioners around the world, including yours truly. In fact, skills I’d picked up from the platform are part of the reason that I recently got to join SocialCops, a company I’d admired for years. However, I hadn’t been on the platform in 2017 as much as I would … Continue reading Kaggle Learn review: there is a deep learning track and it is worth your time
DISCLAIMER: Before you begin, be aware that the protagonist of any fictional account that this write-up might contain would be male and would be referred to as ‘he’ or, at times, simply as ‘coder’. That’s not because the … Continue reading Open Source: The itch, the hustle and the merge
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. – Martin Golding
In previous posts, I discussed how Rperform can be used to obtain and visualize package performance data. However, real-world software development is a collaborative process. Thus, automating performance testing for your package is not only a good idea, it’s a critical one; testing projects locally might not be good enough. This post covers using Rperform with Travis CI for automated performance testing. More importantly, we will be able to assess performance impact … Continue reading Analyze pull requests and Travis builds using Rperform
Developer: Akash Tandon
Mentors: Joshua Ulrich, Toby Dylan Hocking
Official Project Link: Rperform: performance analysis of R package code
This project was meant to deal primarily with three things: developing Rperform’s functionality so that developers can assess the potential performance impact of a pull request (PR) without having to merge it, extending the package’s existing functions for measuring and visualizing performance metrics, and developing a coherent user interface for developers to interact with.
About Rperform
Rperform is a package that allows R developers to track and visualize quantitative performance metrics of their code. It focuses on providing changes in a package’s performance metrics, … Continue reading GSoC 2016 Report – Rperform
I recently participated in a weekend-long data science hackathon titled ‘The Smart Recruits’. Organized by the amazing folks at Analytics Vidhya, it saw some serious competition. Although my performance can be classified as decent at best (47 out of 379 participants), the hackathon was among the more satisfying ones I have participated in on both AV (profile) and Kaggle (profile) over the last few months. Thus, I decided it might be worthwhile to try and share some insights as a data science autodidact.
The problem
The competition required us to use historical data to create a model to help an organization … Continue reading Data Science Competitions 101: Anatomy and Approach
“In God we trust. All others must bring data.” – W. Edwards Deming
In a previous post, I discussed how Rperform uses the grammar of graphics approach to visualize an R package’s performance in terms of runtime and memory usage. The visualizations contribute significantly towards Rperform’s mission to allow package developers to quantify, analyze and visualize performance. However, at times you, the developer, might want to play with the data yourself and perform analysis of your own. After going through this post, you will be able to do exactly that.
Background
If you are new to Rperform, consider … Continue reading Obtaining package performance data using Rperform
tl;dr: Code for scraping GSoC data from the Google Developers website for the years 2005-08 and storing it as CSV files can be found here: GSoC data digging on GitHub. Two CSV files will be created: org_numbers.csv (containing information about the … Continue reading Digging into GSoC Data: Scraping 101 (Part 1)
For the last two years or so, I have been involved with Python in an on-off relationship of sorts. I possessed syntactical familiarity and had completed a MOOC and a couple of web projects using Django. I had also read up on … Continue reading Python in the open!