R vs Matlab vs Python
This post is planned to be an ongoing thought process. I used Matlab while doing my MSc in Intelligent Systems and Robotics at De Montfort University's Centre for Computational Intelligence. When I started using it I thought ‘wow this is cutting edge’ and enjoyed using it (apart from the constant alt-tabbing between windows). So far I’ve used Matlab for:
- Developing (and teaching) Fuzzy Logic, both GUI and code
- Developing (and teaching) Artificial Neural Networks (Perceptron, Pattern Net, etc) using the KDD 1999 Network Data
- Robotics Simulation (iRobot Create)
Then during the MSc dissertation project I was forced to move over to Python, specifically to develop a Convolutional Neural Network connected to a Multilayer Perceptron (sounds awesome, was difficult). Anyway, this led to spending the full summer getting to grips with Python, but for a very specific reason. Later I revisited Python after watching some videos on Time Series using Pandas, and this time I started using the IPython Notebook interface. That is when I started liking Python. I put together a couple of simple scripts looking at K-Means clustering and then a simple perceptron model, and they just worked.
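For a flavour of the kind of simple K-Means script meant here, below is a minimal sketch in plain Python. It is not the original script: the function name `kmeans`, the toy points and the starting centroids are all made up for illustration, and a real script would more likely lean on a library.

```python
def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its old centroid.
        centroids = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up toy data: two obvious blobs.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```

Fixing the starting centroids keeps the result repeatable; library implementations usually pick them randomly.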
So now I’m on R, why? Because during my PhD one of my supervisors sent me a tutorial on data analysis for clinical data in R, as simple as that. Plus, when looking at any machine learning job (post PhD), the common software/languages they ask for are Matlab and R, with an emerging trend for Python. Plus it’s never a bad thing learning a new language.
So I thought I’d write a blog about my journey using all three. Because I’m still deciding which path to take, I am currently playing in both R and Python. I’ve not excluded Matlab, I have just used it more than these. Plus my first supervisor describes trying the techniques in different languages as “good for the soul”.
Hopefully we all know already, but Matlab is proprietary, meaning you’re going to have to hand over some of your money. Not only money for the main software but also for every subsequent ‘toolbox’ like the Fuzzy Logic Toolbox, the Neural Network Toolbox, the Parallel Computing Toolbox, etc. The good news is that they do a student version (not sure if it’s 64-bit, mine was only 32-bit) which is cheaper at around £50 I think, with each toolbox costing roughly £15. I might be wrong on these prices.
Currently I have access to a site licence, fantastic. But it has been a bit of a pain to set up. This is not MathWorks' fault, just my own for not reading the instructions.
So I got around to installing Matlab, this time using the site licence and reading the instructions. Initially I installed the version supplied (2012a), but once I registered it I saw I had full access to later versions, so I put one of those on instead. At this point I thought “that’s nice” and switched it off.
Python and R are free, woo hoo. Including all of the extra functionality (like Matlab's toolboxes). So you can Machine Learn it up all night long for FREE. I’ve been using the Anaconda Python Distribution and it has everything, and being a student I can get the parallel processing package for free (Anaconda Accelerate).
Interface and general usage
I don’t really have much to say on this. I’ve been using RStudio for R and it’s very similar to Matlab. You have a window for the file you are coding, and you can run the whole file or just subsections. It has an area for graphs to be displayed. RStudio just works. Matlab is similar, though last time I used it I got annoyed with the undocked floating windows that kept getting lost. You could probably change that but I didn’t bother looking. With Python I’ve been using IPython Notebook, which I think is fantastic. Everything is kept together in blocks and, like RStudio, you can run one block or all of them.
My main frustration with Matlab is buying the toolboxes. You have to go online, buy it, then install it. Firstly, I don’t like the thought of buying a toolbox. Not because I’m tight with the cash but because I like to play, and I don’t like to buy something I might just play with once or twice. With Python, if you use the Anaconda package, everything you need is already installed. With R and RStudio, if you want something you just click on the install link, type in the name of the package and click go. I wanted to play with some Kaplan-Meier estimation stuff, and not only did R have a package, it had another package with sample data. The main downside is you need to know the name of the package you want. Hasn’t been a problem so far though.
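To give an idea of what a Kaplan-Meier estimate actually computes, here is a minimal sketch in plain Python. It is not the R package mentioned above: the function name `kaplan_meier` and the toy data are made up for illustration, and it only produces the survival curve (no confidence intervals). At each time where an event happens, the running survival probability is multiplied by (1 - events / number still at risk).

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time each subject was followed for
    observed:  1 if the event happened at that time, 0 if the subject was censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings both leave the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ended at time t drops out of the risk set.
        at_risk -= exits[t]
    return curve


# Made-up toy data: seven subjects, two of them censored.
durations = [2, 3, 3, 5, 8, 8, 9]
observed = [1, 1, 0, 1, 1, 0, 1]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects never trigger a drop in the curve, but they do shrink the risk set for later event times, which is the whole point of the estimator.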
The winner so far
Remember, I am continually updating this so keep checking back.
So far my winner is RStudio. I’ve separated RStudio from R because the R GUI/Prompt looks like a pain to work with.
I’m a big fan of Python, SciPy, NumPy and scikit-learn, but industry still seems to be using R and Matlab. This might change, as I am seeing more and more job adverts looking for Python skills, so we’ll see. I definitely would like to see it become more mainstream.
Matlab, you make me feel sad. I have access to a full licence with all the toolboxes but I am reluctant to invest time using you because I know when I leave university you will make me pay.
UPDATE – Each to their own, check this out for a summary of all three (easier to digest).