Will Python replace Excel?
The answer is, er, complicated....
At a meetup I attended in Manchester earlier this year (shout out to the North West Ruby User Group) I had an interesting conversation with a guy who was the founder of a startup. When I asked about his business model, he looked a little sheepish and explained that it was a SaaS offering that allowed clubs and schools to manage subscriptions and billing.
Like many, when I hear "startup" I immediately think of companies that are going to disrupt the delivery of new contact lenses by autonomous drone before you even know that you've run out by using ML models based on how many times you've eaten carrots or listened to Art Garfunkel. OK so I made that up but if you end up making money out of it, I will be asking for equity.
There was no need for this guy to feel embarrassed about his business model. We talked for a while and it seemed, after a tough start, that he and his partners were doing well. Given the current climate, I worry a little about their cash flow but then no one foresaw this.
Pandemics aside, one thing he said stuck with me which, to paraphrase, was something like:
Behind every spreadsheet there's a potential business opportunity
I liked this way of looking at things. Even though it wasn't an original insight, it made sense to me as someone who has spent years working with a tool whose unofficial slogan has been "Escape Excel Hell" for decades. Yet, in 2020, there's still quite a lot of spreadsheet misery going around.
A few months ago, I listened to this episode of the excellent Talk Python to Me podcast. It was great, and I don't mean to criticise it or guest Chris Moffitt. Indeed, his blog Practical Business Python looks to be a great resource and I don't think he's suggesting Python will replace Excel either. That said, I have heard people putting this idea forward, or at least versions of it.
Why? What's stopping people from switching to Python?
I can think of a few reasons, which I've listed below. I should point out that I don't have any skin in this game; indeed, I don't use Excel much, but I think spreadsheets have a place. I also know loads of accountants.
Inertia
This is the obvious one. I've read that Excel alone has over 750 million users. That's a lot and between them, they're managing spreadsheets that may (literally) date back decades. I've worked on numerous projects where the stated aim is to eradicate a collection of bloated, poorly understood spreadsheets. It's hard to unpick them, and that's from someone who is paid to do so. It's not feasible to expect people to migrate away from these legacy tools and workflows without significant (i.e. expensive) outside support. Furthermore, people just don't like change. Quite frankly, the idea that many people have the appetite to change is a bit ridiculous.
It's not just a new tool, it's a new paradigm
When I read articles about this Python-based revolution, many focus on comparing Pandas to Excel for manipulating and analysing data. Leaving aside the difficulty in learning a new tool, this feels like comparing apples to oranges.
A spreadsheet provides what is essentially a functional programming environment with pretty strict constraints. Python, meanwhile, is an interpreted programming language with a vast ecosystem. Excel is like a Trojan horse that tricks people into programming without realising it; Python is much more than that.
I assume that most proponents of this idea imagine people transitioning to using Pandas in Jupyter notebooks. Sure, it's tabular data but it's still a different way of thinking. Part of what makes spreadsheets so intuitive is that you can always "see" the value of a variable (i.e. a cell) even if it is nested into some horrific formula. A variable in a Python notebook is a mysterious thing for the uninitiated.
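To make the "you can always see the value" point a bit more concrete, here's a toy sketch in Python of how a sheet behaves: every cell is either a constant or a pure formula over other cells, and every value is on show at once. The cell names, the numbers and the `evaluate` helper are all made up for illustration. This isn't how Excel is implemented, and it ignores things like circular references.

```python
# A toy "sheet": each cell is a constant or a formula over other cells.
# Cell names, values and the `evaluate` helper are hypothetical.
sheet = {
    "A1": 120,                                 # units sold
    "A2": 9.5,                                 # price per unit
    "A3": lambda get: get("A1") * get("A2"),   # =A1*A2  (revenue)
    "A4": lambda get: get("A3") * 0.5,         # =A3*0.5 (50% deposit)
}


def evaluate(sheet):
    """Return the value of every cell, resolving formulas recursively.

    No cycle detection: a circular reference would recurse forever.
    """
    cache = {}

    def get(name):
        if name not in cache:
            cell = sheet[name]
            # Formulas are callables that look up other cells via `get`.
            cache[name] = cell(get) if callable(cell) else cell
        return cache[name]

    return {name: get(name) for name in sheet}


values = evaluate(sheet)
for name, value in values.items():
    print(name, value)  # every cell's current value is visible at once
```

Change `A1` and call `evaluate` again and everything downstream follows, which is roughly the mental model a spreadsheet user already has. In a notebook, you have to remember to re-run the right cells yourself.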
Some pundits seem to think that there's a whole new generation entering the workforce that has learned Python. Some will have, but a lot more will have spent just enough time on Stack Overflow to submit an assignment. For those in entry-level roles, I'd suggest even the latter will be rare.
Just because it makes sense for one use case, doesn't mean it will for all
Most of the posts I read also seem to follow the pattern of "I replaced my Excel spreadsheet with Python so everyone should". This doesn't transfer across domains. I also think people underestimate how ubiquitous spreadsheets are!
Like it or not, people use spreadsheets for things as simple as to-do lists. Sure, there are arguably better tools for this, but again and again, I see people who have tools like Trello still use a spreadsheet instead. Either way, that isn't going to be solved by Python.
The transition proposed is really to a different way of modelling, not just a different tool
Some of the reasons I've heard people offer for choosing Python over Excel really describe a transition to a completely different way of modelling.
Moving from doing a simple assumption-based forecasting model in Excel to a fully-fledged ML model trained on years of historical data isn't a question of tooling. Of course, Python is a better choice here but you're talking about completely changing your process.
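For what it's worth, here's roughly what I mean by an assumption-based forecast, sketched in Python. The `forecast` function, the starting figure and the growth rate are all invented for the example. The point is only that the whole "model" is one assumed number compounded forward, which is exactly the kind of thing a single column of formulas already does.

```python
def forecast(start_revenue, growth_per_period, periods):
    """Project revenue forward by compounding one assumed growth rate.

    Hypothetical helper: no historical data, no fitting, just an assumption.
    """
    revenue = start_revenue
    projection = []
    for _ in range(periods):
        revenue = revenue * (1 + growth_per_period)
        projection.append(revenue)
    return projection


# Assumed inputs, chosen only to keep the arithmetic easy to follow.
projection = forecast(start_revenue=1000.0, growth_per_period=0.25, periods=3)
print(projection)
```

Rewriting this in Python doesn't change the model at all. Swapping it for something trained on years of history does, and that's a process change, not a tooling one.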
Visualisation and Integration with the dreaded PowerPoint
Python has an amazing list of charting options which I won't go into here but these aren't a drop-in replacement for charting in Excel. Indeed, one of the real advantages of the Microsoft stack in this space is how uniform the UX is across several tools like Excel, Access, Power BI and even SSAS.
Many Excel users also need to ultimately get their visuals into PowerPoint which, as much as I loathe it, is still all-pervasive in some organisations. Sure, there are ways to do this with Python but they're non-trivial. If anything, I'd say that Power BI is the real growth area in this space.
The maintainability fallacy
A former colleague even suggested that Python is somehow inherently easier to maintain than an Excel spreadsheet. I think this is a little naive as anyone who has tried to unpick bad code, in any language, should understand.
You can write horrible Python, and reproducibility is in some ways harder than with Excel given the need to manage different Python environments and dependencies.
Sure, raw Python is easier to check into git than a spreadsheet full of formulas but try explaining what git is to a junior accountant, I dare you. That's after you've explained how to install their project's dependencies with pip.
Lots of people still only have a hammer
When all you have is a hammer, everything looks like a nail
Most people have come across this phrase, or something like it, also known as the Law of the Instrument. This is abundantly true with Excel too. I've seen all sorts of things implemented in Excel, some horrible, some admirable. However, I don't think it's a given that the toolbox is about to expand any time soon.
Python on Windows has improved out of sight but it's still not trivial for the average Excel user. Moreover, in many corporate environments, there are significant barriers to adoption. A common on-ramp is a tool like Anaconda but in my experience, this isn't always easily available, or supported, in big organisations.
So what is going to happen then?
I'm not sure but I don't see a new dawn of Python domination in this huge space. I think there will be a shift from a dependence on locally run Excel stacks to cloud spreadsheet offerings. Google Sheets raised the bar when it came to concurrent use and may have stolen some ground from Microsoft but I suspect Excel will outlive me.