In Python we trust

In this pursuit of becoming independent as a data professional, the first thing that comes to our mind (in an American accent “our” sounds like “R”) is to build the full stack on our own, with Python. Whether one uses Scala, Python or R, the problem is the same: these languages are amazing for working with data, and they have plenty of free data libraries that are basically destroying old proprietary models, such as the monopoly that Matlab once held. But they are not so shiny when it comes to building a full-stack service.

Note that I am not saying this is impossible. I am just pointing out that there is no full-stack PWA solution out there built with Python (or with any of the other languages mentioned) by any top tech company. It is true that you can visualize beautiful analytics with Shiny, Streamlit or Plotly’s Dash, and that you can build a REST API with Flask, but that is not real end-to-end independence, just some more autonomy: the overall workload is displaced, not reduced. There are no success cases out there beyond Medium blog posts that merely show some prototyping facets. Let’s dig deeper into the topic.

The mermaids: Plotly, Streamlit and others

Plotly is probably the most amazing visualization library I have worked with. Moreover, its makers have another library called Dash that allows anyone to build a web app, with Python! It seems to be the solution data people were waiting for. Nevertheless, there is a key drawback: you still need software engineers to build your data app.

You do not need a Front-End developer who speaks a different language anymore, but someone needs to know or learn Front-End technology: callbacks, inputs, outputs, etc. (programmed with Python, yep). And one still has to build the data storage architecture and all the DevOps so that everything works (or pay Plotly to help with this last part). Plotly means more freedom for Python developers, freedom that all of us dreamt of, but not enough freedom for data app creation.
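
To give a feel for the Front-End concepts mentioned above, here is a minimal sketch of a Dash callback that wires one input to one output. It assumes Dash 2.x (where `Dash`, `dcc`, `html`, `Input` and `Output` can be imported from the top-level `dash` package). The component ids, the metric names and the helper `describe_selection` are hypothetical and only serve the illustration.

```python
# Hypothetical metric names, used only for this illustration
METRICS = ["revenue", "churn"]


def describe_selection(metric: str) -> str:
    """Plain Python logic: turn the selected dropdown value into display text."""
    return f"You selected: {metric}"


def build_app():
    # Dash is imported here so the plain function above stays usable without it
    from dash import Dash, Input, Output, dcc, html

    app = Dash(__name__)

    # The layout declares the components; each id is what callbacks refer to
    app.layout = html.Div([
        dcc.Dropdown(
            id="metric",
            options=[{"label": m, "value": m} for m in METRICS],
            value=METRICS[0],
        ),
        html.Div(id="summary"),
    ])

    # Callback: whenever the dropdown value changes, refresh the text below it
    @app.callback(Output("summary", "children"), Input("metric", "value"))
    def update_summary(metric):
        return describe_selection(metric)

    return app
```

Serving it locally would be a matter of calling `build_app().run(debug=True)` in recent Dash releases (older ones use `run_server`). Note that everything beyond this file, such as storage, deployment and authentication, is still left to you, which is exactly the point made above.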


Look at this other image: you still need a Back-end and DevOps with Streamlit or Plotly Dash, and a Python developer has to go through the whole front-end learning curve (eventually becoming a pseudo Front-end). The benefit is that you need virtually one profile fewer to create a data app, and your data engineer gains autonomy.

Streamlit offers easy-to-prototype dashboards, but they are not really handsome, and they are not the sort of data apps you can really sell or share with that 40% of the market that wants to share data with their clients. And let’s be crystal clear here: we are talking about data products, not about sharing results with colleagues anymore. For data to progress as an area we need data-based products, and this requires sharing results in a professional way, with the latest back-end and front-end technologies. Streamlit is nice for sharing with your friends and for convincing your boss to invest some more resources, but it is far from being a real option for creating a data app you can actually sell.

The problem with Shiny is the same. Hence, one ends up going through a hard learning curve to gain that autonomy, and it is still not full autonomy; you are still missing one leg to run.

BI tools, the wrong wildcards

BI tools such as Tableau or Power BI are often used as a last resort when building a data app product. They cannot scale, and they cannot offer real UX to your clients (think of the difference between navigating Google Analytics and your last Tableau app).

You do not need engineers for BI tools, but they can only be used by a few users simultaneously, and their output grows only linearly with the size of your data department (you need 2x data analysts to offer 2x BI apps). Basically, you need an army of data analysts and computers the size of a building to try such a thing. Eventually (when you serve more than 10 clients) it becomes the least profitable of all the scenarios.

They are tools for internal use, and one cannot create data products from them (if we understand data products as assets that can be sold at scale). The manual work of keeping dashboards updated, adapting them and changing them scales only linearly with the number of data analyst hands you have. Moreover, the outcome can only be offered to a small number of users before your architecture costs scale up and devour any possible profit.
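
The linear scaling described above can be made concrete with a little arithmetic. This is only an illustrative sketch: the function name and the figure of five apps per analyst are hypothetical, not numbers taken from any real deployment.

```python
from math import ceil


def analysts_needed(num_apps: int, apps_per_analyst: int) -> int:
    """Headcount required when every BI app is maintained by hand.

    apps_per_analyst is a hypothetical capacity: how many dashboards
    one analyst can keep updated and adapted on their own.
    """
    return ceil(num_apps / apps_per_analyst)


# Doubling the number of apps doubles the headcount (hypothetical capacity of 5)
small_team = analysts_needed(num_apps=10, apps_per_analyst=5)
large_team = analysts_needed(num_apps=20, apps_per_analyst=5)
print(small_team, large_team)
```

Whatever capacity you plug in, the headcount grows in step with the number of apps, so there is no economy of scale to be had.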

