Frameworks and Libraries
What are Frameworks?
In computer programming, a software framework is an abstraction in which software providing generic functionality can be selectively changed by additional user-written code, thus providing application-specific software.
What are Libraries?
In computer science, a library is a collection of non-volatile resources used by computer programs, often for software development. These may include configuration data, documentation, help data, message templates, pre-written code and subroutines, classes, values, or type specifications. In IBM's OS/360 and its successors, they are referred to as partitioned data sets.
A library is also a collection of implementations of behavior, written in terms of a language, that has a well-defined interface by which the behavior is invoked. For instance, people who want to write a higher-level program can use a library to make system calls instead of implementing those system calls over and over again. Also, the behavior is provided for reuse by multiple independent programs. A program invokes the library-provided behavior via a mechanism of the language. For example, in a simple imperative language such as C, the behavior in a library is invoked by using C's normal function call. What distinguishes the call as being to a library function, versus being to another function in the same program, is the way that the code is organized in the system.
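The difference between the two definitions can be sketched in a few lines of Python. In the library case our code stays in control and calls a function from the standard `math` module; in the framework case a generic piece of software owns the control flow and calls user-written code. The `MiniFramework` class and its `register` and `run` methods are hypothetical names invented for this illustration, not part of any real framework.

```python
import math


# Library style: our program decides when to call the library function.
def hypotenuse(a: float, b: float) -> float:
    return math.sqrt(a * a + b * b)


# Framework style (hypothetical): generic code that is customized by
# plugging in user-written handlers, which the framework then calls.
class MiniFramework:
    def __init__(self):
        self._handlers = {}

    def register(self, name):
        """Return a decorator that stores a user function under `name`."""
        def decorator(func):
            self._handlers[name] = func
            return func
        return decorator

    def run(self, name, *args):
        """Generic behavior (formatting) wrapped around user-specific behavior."""
        handler = self._handlers[name]
        return f"[{name}] {handler(*args)}"


app = MiniFramework()


@app.register("greet")
def greet(user):
    return f"Hello, {user}!"


print(hypotenuse(3, 4))        # 5.0
print(app.run("greet", "Ada"))  # [greet] Hello, Ada!
```

The library call is an ordinary function call, exactly as described above for C; the framework, by contrast, is the caller, and the user-written `greet` function only supplies the application-specific part.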
TOP 15 DATA SCIENCE LIBRARIES
NumPy
NumPy (short for Numerical Python) is the most fundamental package, around which the scientific computing stack is built. It provides an abundance of useful features for operations on n-dimensional arrays and matrices in Python.
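As a small illustration of array and matrix operations, the sketch below builds a 2x3 array and applies a reduction, an element-wise operation, and a matrix product. The variable names are arbitrary and chosen only for this example.

```python
import numpy as np

# A 2x3 array: [[0, 1, 2], [3, 4, 5]]
a = np.arange(6).reshape(2, 3)

# Reduce along axis 0 to get the mean of each column.
col_means = a.mean(axis=0)

# Element-wise arithmetic applies to every entry without an explicit loop.
scaled = a * 10

# Matrix product of a (2x3) with its transpose (3x2) gives a 2x2 matrix.
product = a @ a.T

print(col_means)  # [1.5 2.5 3.5]
print(product)    # [[ 5 14]
                  #  [14 50]]
```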
SciPy
SciPy is a library of software for engineering and science.
Note the difference between the SciPy Stack and the SciPy library: the Stack is the broader collection of scientific Python packages, while the library is one of its core packages. The SciPy library contains modules for linear algebra, optimization, integration, and statistics.
The main functionality of the SciPy library is built upon NumPy, and it makes substantial use of NumPy arrays.
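A minimal sketch of three of the module areas named above (integration, optimization, and linear algebra), each operating on NumPy values. The specific problems solved here are illustrative choices, not taken from the original text.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Integration: the area under sin(x) between 0 and pi is 2.
area, abs_err = integrate.quad(np.sin, 0.0, np.pi)

# Optimization: find the minimum of a one-dimensional function.
result = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2)

# Linear algebra: solve the system 3x + y = 9, x + 2y = 8.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

print(area)      # close to 2.0
print(result.x)  # close to 3.0
print(solution)  # close to [2. 3.]
```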
Pandas
Pandas is a Python package designed to make working with "labeled" and "relational" data simple and intuitive.
Pandas is a perfect tool for data wrangling. It is designed for quick and easy data manipulation, aggregation, and visualization.
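The sketch below shows what manipulation and aggregation of labeled data can look like, followed by a relational-style join. The `sales` and `managers` tables and all their values are made up for the example.

```python
import pandas as pd

# Labeled data: every column has a name, every row an index label.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units": [10, 7, 5, 8],
    "price": [2.0, 3.0, 2.0, 3.0],
})

# Manipulation: derive a new column from existing ones.
sales["revenue"] = sales["units"] * sales["price"]

# Aggregation: total revenue per region.
by_region = sales.groupby("region")["revenue"].sum()

# Relational data: join the aggregate with a second table on a shared key.
managers = pd.DataFrame({
    "region": ["north", "south"],
    "manager": ["Ana", "Raj"],
})
report = by_region.reset_index().merge(managers, on="region")

print(report)
```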
Matplotlib
Matplotlib is another core package of the SciPy Stack, a Python library tailored for generating simple yet powerful visualizations with ease.
It is a top-notch piece of software that makes Python (with the help of NumPy, SciPy, and Pandas) a capable competitor to scientific tools such as MATLAB.
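A minimal plotting sketch, assuming a non-interactive environment: it selects the `Agg` backend so that no display is needed and writes the PNG to an in-memory buffer instead of a file. The plotted functions and labels are arbitrary choices for illustration.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.plot(x, np.cos(x), label="cos(x)")
ax.set_xlabel("x")
ax.set_ylabel("value")
ax.set_title("Two curves on one set of axes")
ax.legend()

# Render the figure as PNG bytes; pass a file name instead to save to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
```

In an interactive session, `plt.show()` would display the same figure in a window instead.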
Seaborn
Seaborn is mostly focused on the visualization of statistical models; such visualizations include heat maps, which summarize the data but still depict the overall distributions.
Seaborn is built on Matplotlib and depends heavily on it.
Bokeh
Another great visualization library is Bokeh, which is aimed at interactive visualizations.
In contrast to Seaborn, Bokeh is independent of Matplotlib.
The main focus of Bokeh, as already mentioned, is interactivity; it renders its output in modern browsers in the style of Data-Driven Documents (D3.js).
Plotly
Plotly is a web-based toolbox for building visualizations that exposes APIs to several programming languages (Python among them).
There are some robust, out-of-the-box graphics on the plot.ly website. To use Plotly's hosted service, you will need to set up your API key.
SciKit-Learn
Scikits are additional packages of the SciPy Stack designed for specific functionality such as image processing and machine learning.
For machine learning, the most prominent of these packages is scikit-learn.
The package is built on top of SciPy and makes heavy use of its math operations.
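To give a sense of how scikit-learn is typically used, here is a minimal sketch that trains a classifier on the iris dataset bundled with the package. The choice of logistic regression, the 25% test split, and the `random_state` value are assumptions made for this example only.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load a small built-in dataset as NumPy arrays (features X, labels y).
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to evaluate the fitted model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the model on the training split.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Mean accuracy on the held-out samples.
accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Other estimators in the package follow the same `fit` / `predict` / `score` pattern, so swapping the model usually means changing only the line that constructs it.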
Theano
Theano is a Python package that defines multi-dimensional arrays similar to NumPy, along with math operations and expressions.
The library is compiled, which lets it run efficiently on a range of architectures. Originally developed by the Machine Learning group of Université de Montréal, it is primarily used for machine learning.
Keras
Next, let's look at Keras. It is an open-source library, written in Python, for building neural networks through a high-level interface.
It is minimalistic and straightforward, with a high level of extensibility.
It uses Theano or TensorFlow as its backend.
NLTK
NLTK (the Natural Language Toolkit) is a Python library for common natural language processing tasks such as tokenization, stemming, tagging, and parsing.
Scrapy
Scrapy is a library for writing crawling programs, also known as spider bots, that retrieve structured data, such as contact info or URLs, from the web.
It is open-source and written in Python.
It was originally designed strictly for scraping, as its name indicates, but it has evolved into a full-fledged framework that can gather data from APIs and act as a general-purpose crawler.
Statsmodels
As you have probably guessed from the name, statsmodels is a Python library that lets its users explore data by estimating statistical models and performing statistical tests and analysis.