🚀⚡New open-source⚡ VS. old open-source 🦖

Marine - Nov 27 '23 - Dev Community

TL;DR

In this article, I provide alternatives to mainstream Python libraries.
These alternatives add some value to the Python landscape even though mainstream libraries are supported by stronger active communities.
Choosing your libraries comes down to your use case and personal preference.


1.Taipy instead of Streamlit

Taipy is the new kid on the block. Just like Streamlit, it provides an easy way to build interactive GUIs: a simple Python app builder.
However, Taipy addresses most of Streamlit's limitations and inefficiencies:

  • Manages both synchronous/asynchronous calls
  • Full notebook compatibility
  • Multi-user
  • More customization options for layout, styling, etc. (no CSS needed)
  • Big Data Support
  • Better performance

Star ⭐ the Taipy repository

We appreciate any kind of help to grow our community 🌱


2.Polars instead of Pandas

Polars is inspired by Python's royalty: Pandas. Like Pandas, it's a DataFrame library created to handle data, but it really shines when processing large datasets.
Polars is often reported to be 10 to 100 times faster than Pandas, for two main reasons:

  • Polars' built-in parallel processing
  • Being written in Rust

Will Polars replace Pandas? Only time will tell.

Check out Polars


3.Dask instead of PySpark

Dask can handle larger-than-memory computations combined with parallel computing.
It is a great tool when you want your calculations to scale. It is written natively in Python, making it a breeze to learn/use (for Python developers).
It is not designed for very large datasets (2 TB or more), nor is it competitive with Spark if you are dealing with SQL-like queries.
It is perfect for running on a laptop.
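
As a rough sketch of what chunked, parallel computation looks like, here is a small example using `dask.array`. The array size and chunk shape are arbitrary choices for illustration; a real larger-than-memory workload would simply use a bigger array loaded from disk.

```python
import dask.array as da

# A 4,000 x 4,000 array split into 1,000 x 1,000 chunks (16 blocks in total).
# Dask only records how to build each chunk; nothing is allocated yet.
x = da.ones((4_000, 4_000), chunks=(1_000, 1_000))

# Operations build a task graph instead of running immediately
lazy_mean = (x * 2).mean()

# compute() executes the graph, processing chunks in parallel
result = lazy_mean.compute()
print(x.numblocks, result)
```

Because each chunk is processed independently, only a few chunks need to sit in memory at any one time.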

Check out Dask


4.LightGBM instead of XGBoost

Both XGBoost and LightGBM are gradient-boosting libraries.
XGBoost is a Kaggle favorite, but when it comes to handling large datasets, LightGBM is optimized for big data with parallel computation.

Check out LightGBM


5.PyCaret instead of Scikit-learn

Like Scikit-learn, you can perform Machine Learning tasks with PyCaret.
PyCaret exposes its functionality through simpler code, which makes it a great way to get started with ML projects.
It is simple and easy to learn. Some of its high-level functionalities are:

  • EDA & Data Processing
  • Modeling / Training
  • Model Explainability
  • Model Deployment

Its end-to-end coverage of the various machine learning steps makes PyCaret a great tool for ML enthusiasts or even senior Data Scientists with no time for deeper analysis!

Check out PyCaret


6.Darts instead of tsfresh

Both libraries are dedicated to time series. However, they serve different purposes.

Darts is the “sklearn” of time series. It covers all the different functions a data scientist needs when dealing with time series:

  • Data Discovery
  • Data Preprocessing
  • Forecasting
  • Model Evaluation / Selection

No need to use several libraries anymore; it is all available in Darts.

tsfresh is about automating one of the most challenging steps when preparing time series for an ML training step: feature extraction and selection.

tsfresh can extract a large panel of features from your time series and help you identify the relevant ones.

Check out Darts


7.PyTorch instead of TensorFlow

Both are the go-to libraries for data scientists and researchers involved with deep learning.
TensorFlow was the prevalent library a few years back, but between 2020 and 2021, PyTorch caught up with TensorFlow.

How do you choose between these two incredible libraries?

PyTorch seems to have an edge in research with a bigger focus on NLP.
Additionally, PyTorch has a more pythonic feel with an easier learning curve.
I would recommend giving PyTorch a go if you're new to the deep-learning game; otherwise, both libraries are on par.
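
To illustrate the "pythonic feel", here is a minimal sketch of a PyTorch training loop. The network shape, random data, and learning rate are arbitrary assumptions for illustration; the point is that training is an ordinary Python `for` loop.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the random data reproducible

# A tiny fully connected network (sizes chosen arbitrarily)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Random inputs and targets standing in for a real dataset
x = torch.randn(16, 4)
y = torch.randn(16, 1)

losses = []
for step in range(5):
    optimizer.zero_grad()        # reset gradients from the previous step
    loss = loss_fn(model(x), y)  # forward pass
    loss.backward()              # backpropagation
    optimizer.step()             # update the weights
    losses.append(loss.item())

print(losses)
```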

Check out PyTorch


8.Arcade instead of Pygame

In the Python 2D gaming scene, Pygame has acquired a solid reputation, while Arcade, a newer but well-established library, stands out on these properties:

  • built-in game loop
  • efficient event model
  • more features
  • more user-friendly

Both libraries have their own advantages; however, Arcade is a more suitable option for beginners.
That said, Pygame does offer an educational alternative, Pygame Zero, which is a better fit for new developers.

Check out Arcade


9.spaCy instead of NLTK

NLTK is the mainstream library for Natural Language Processing and has a plethora of functionalities.
However, with more complexity comes a steeper learning curve. spaCy is a good option for getting started in the field.
The other big advantage of spaCy is that it was built to optimize NLP applications, focusing on greater speed and efficiency.
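
As a small sketch of how little code spaCy needs, the example below tokenizes two sentences with a blank English pipeline, which requires no model download. The example sentences are my own; `nlp.pipe` processes texts as a stream, which is the usual way to handle many documents efficiently.

```python
import spacy

# A blank English pipeline: tokenizer only, no trained model required
nlp = spacy.blank("en")

texts = ["spaCy is fast.", "NLTK has many features."]

# nlp.pipe streams documents through the pipeline
tokenized = [[token.text for token in doc] for doc in nlp.pipe(texts)]
print(tokenized)
```

Tagging, parsing, and named entity recognition need a trained pipeline, which is installed separately.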

Check out spaCy


10.Ruff instead of Pylint

Linters are an essential part of any coding journey.
Pylint is widely used, but Ruff adds effectiveness and speed to the process.
Reported to be 10-100 times faster than equivalent linters, Ruff is definitely a good library to check out as a Pylint alternative.

Check out Ruff


I hope you enjoyed this article! 🙂
I'm a rookie writer and would welcome any suggestions for improvement!
Feel free to share if you have favorite libraries that you prefer over more mainstream ones.
