How the ‘why’s drove the ‘what’: Epilogue

“Study hard what interests you the most in the most undisciplined, irreverent and original manner possible” – Richard Feynman

As a deeply confused and somewhat optimistic sophomore, I was in the habit of taking witty quotes more seriously than most. The one above, for example, has guided how I have gone about studying Machine Learning and related topics for the last two years or so. Then again, as a chemical engineering major in an Indian college with a tragically rigid curriculum, I didn’t have much of a choice. Fast forward a couple of years and after a few online courses, …

Analyze pull requests and Travis builds using Rperform

“Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.” – Martin Golding

In previous posts, I discussed how Rperform can be used to obtain and visualize package performance data. However, real-world software development is a collaborative process. Thus, automating performance testing for your package is not just a good idea, it’s a critical one; testing projects locally might not be good enough. This post covers using Rperform with Travis CI for automated performance testing. More importantly, we will be able to assess performance impact …

GSoC 2016 Report – Rperform

Developer: Akash Tandon
Mentors: Joshua Ulrich, Toby Dylan Hocking
Official Project Link: Rperform: performance analysis of R package code

This project was meant to deal primarily with three things: developing Rperform’s functionality so that developers can see the potential performance impact of a pull request (PR) without having to merge it, extending the package’s existing functions for measuring and visualizing performance metrics, and developing a coherent user interface for developers to interact with.

About Rperform

Rperform is a package that allows R developers to track and visualize quantitative performance metrics of their code. It focuses on providing changes in a package’s performance metrics, …

Data Science Competitions 101: Anatomy and Approach

I recently participated in a weekend-long data science hackathon, titled ‘The Smart Recruits’. Organized by the amazing folks at Analytics Vidhya, it saw some serious competition. Although my performance can be classified as decent at best (47 out of 379 participants), it was among the more satisfying ones I have participated in on both AV (profile) and Kaggle (profile) over the last few months. Thus, I decided it might be worthwhile to try and share some insights as a data science autodidact.

The problem

The competition required us to use historical data to create a model to help an organization …

Obtaining package performance data using Rperform

“In God we trust. All others must bring data.” – W. Edwards Deming

In a previous post, I discussed how Rperform uses the grammar of graphics approach to visualize an R package’s performance in terms of runtime and memory usage. The visualizations contribute significantly towards Rperform’s mission to allow package developers to quantify, analyze and visualize performance. However, at times you, the developer, might want to work with the data directly and perform an analysis of your own. After going through this post, that is exactly what you will be able to do.

Background

If you are new to Rperform, consider …

Visualizing package performance using Rperform and Grammar of Graphics

“The greatest value of a picture is when it forces us to notice what we never expected to see.” – John Tukey

Replace ‘picture’ in the above quote with ‘data visualization’ and it will still ring true; maybe even more so. Providing valuable insights to package developers is exactly what Rperform strives to do through its visualization functions.

Background

If you are new to Rperform, consider going through its GitHub README once. In a nutshell, Rperform is an R package that allows package developers to track and visualize quantitative performance metrics of their code, over time. It focuses on providing …

Rperform in Google Summer of Code 2016

Rperform started as a GSoC 2015 project with the aim “to provide a package with functions that make it easy for R package developers to track quantitative performance metrics of their code, over time.” Much of the required functionality was implemented over the course of last summer. This included various performance visualization functions and integration with the Travis CI workflow, among other things. The project has been accepted into the GSoC program again, under the R Project for Statistical Computing organization. I will be working on it over the summer with my mentors, Toby Dylan Hocking …