PANDAS

Pandas is a Python library developed by Wes McKinney. The USP of Pandas is that it provides a “data frame” environment in Python. The data frame is one of the most commonly used data structures in R. The first thing that pains you when you start coding in Python after having worked in R is that there is no readily available data frame object. Even though there is the NumPy ndarray object, it is not as flexible and extensible as the data frame in R. The Pandas library addresses this problem by providing DataFrame and a ton of associated functionality that can be used for data munging, data cleaning and interactive data exploration. If you look at typical data cleaning code written using Pandas and R, they will look very similar. However, Pandas is like R’s data frame on steroids. Pandas also has some preliminary graphing capabilities built on matplotlib. I think as the library matures, it will be a default module in any data analyst’s toolkit.
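
To give a flavor of the kind of data cleaning mentioned above, here is a minimal sketch. The column names and values are made up purely for illustration, and the cleaning choices (dropping rows with no city, filling missing temperatures with the mean) are just one reasonable option, not a recommended recipe.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with a few missing values (NaN / None).
raw = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi", None],
    "temp_c": [31.0, np.nan, 38.5, 40.1, 25.0],
    "rain_mm": [0.0, 2.5, np.nan, 0.0, 1.0],
})

# Drop rows where the city is unknown, then work on a copy.
clean = raw.dropna(subset=["city"]).copy()

# Fill missing temperatures with the column mean, missing rainfall with 0.
clean["temp_c"] = clean["temp_c"].fillna(clean["temp_c"].mean())
clean["rain_mm"] = clean["rain_mm"].fillna(0.0)

# Quick exploration: average temperature and rainfall per city.
summary = clean.groupby("city")[["temp_c", "rain_mm"]].mean()
print(summary)
```

Someone coming from R will probably find this familiar: `dropna` plays roughly the role of `na.omit`, and the `groupby(...).mean()` step is close in spirit to `aggregate`.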