The spreadsheet is an incredibly powerful application. Unfortunately, in its current form the spreadsheet is too complex for most users. We can do much better. In this post I will share a few ideas about how to expand the audience of consumers leveraging the power of spreadsheets.
First, a little spreadsheet history.
For the full story, check out A Brief History of Spreadsheets. The story in a nutshell is that the core of Excel [and just about every other spreadsheet] started in 1978 when Harvard Business School student Dan Bricklin envisioned an “interactive visible calculator.” He and Bob Frankston then invented VisiCalc.
In 1986 Lotus had an advanced technology group attempt to reinvent the spreadsheet. The idea was to separate data, views, and formulas – Lotus Improv. For any Excel junkie, this is a very powerful idea. Since VisiCalc, the spreadsheet user has been burdened by data and formulas sharing cells interchangeably held together by “spaghetti logic.”
Steve Jobs instantly recognized the power of the idea of separating data, views, and formulas and Improv first launched on the NeXT computer. It was arguably the killer application for the NeXT, which sold well into Wall Street. A version launched for Windows in 1993, but (according to Wikipedia) corporate users had trouble understanding Improv because it was so different from existing patterns. I own the last Windows version, and in addition to being so different from Excel, the product was extremely raw. There wasn’t even an “undo” feature. Imagine that. Microsoft took the idea of separating data, views, and formulas and launched pivot tables.
When I heard the rumor that Apple was working on Numbers, I was hopeful that Steve would reinvent the spreadsheet. Either by making the ideas from Improv usable by the average consumer or by doing something all new — like allowing cool mash-ups between web services and data in a user’s spreadsheet. But then Apple launched Numbers and I was disappointed. Apple made doing some things on a spreadsheet easier — they definitely made it easier to create attractive output (charts, tables, powerful print views). But Numbers didn’t really enable anything that I couldn’t do before.
Google’s web-based spreadsheet has done a very good job of implementing an AJAX version of Excel. They are on their way to making Microsoft Office irrelevant (they even use the same colors: blue for Word, orange for PowerPoint, and green for Excel). Google recently added the ability to create charts, tables, and maps by mashing up spreadsheet data with web services like Google Maps. They have also opened up Spreadsheet to third parties to develop new “gadgets” for displaying data in new ways. But Google has not enabled the use of existing Excel macros, and any power user of Excel needs macros. Thousands of companies around the globe have built up a massive amount of business logic leveraging macros.
Project Sage: Nobody is smarter than everybody.
The core idea I propose here is to think about spreadsheet design as having two (rather than one) core user segments — the creator and the consumer. The creator, like the creator of a web application, invests an enormous amount of energy in building models which include data, views, and formulas. The consumer is focused on viewing the spreadsheet and, hopefully, contributing bits of wisdom by playing around with the key assumptions in the model. Offer powerful tools to the creator and make it extremely easy for the consumer to participate by sharing their knowledge with minimal effort.
Most community-based web products have a very small group of creators (often ranging in size from <1% to 5% of total users) and a massive group of consumers (everyone else).
+ Prediction Markets & Polls
In nearly every serious spreadsheet I have ever put together, there are many assumptions: revenue growth rates, key income statement expense items, DSOs, discount rates, probabilities of certain events, and the list goes on and on. Many of those assumptions would benefit from the “wisdom of crowds.” The dirty secret of these complex models is that they are [almost always] highly sensitive to small changes in key assumptions. The more assumptions you have, the more precise the output appears. Unfortunately, that precision is likely fictitious.
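To make the sensitivity point concrete, here is a minimal sketch in Python. It assumes a deliberately tiny model that is not taken from any real spreadsheet: a cash flow growing forever, valued with the standard growing-perpetuity formula. The function name and all the numbers are hypothetical.

```python
def perpetuity_value(cash_flow: float, discount_rate: float, growth_rate: float) -> float:
    """Value of a cash flow that grows forever at growth_rate.

    Standard growing-perpetuity formula: cash_flow / (discount_rate - growth_rate).
    This toy model is an assumption made for illustration only.
    """
    if discount_rate <= growth_rate:
        raise ValueError("discount_rate must exceed growth_rate")
    return cash_flow / (discount_rate - growth_rate)


# Hypothetical inputs: 100 per year, 3% growth, and two discount-rate guesses.
base = perpetuity_value(100.0, discount_rate=0.10, growth_rate=0.03)
tweaked = perpetuity_value(100.0, discount_rate=0.09, growth_rate=0.03)

# Relative change in the output caused by a one-point change in one assumption.
change = (tweaked - base) / base
print(f"{base:.2f} -> {tweaked:.2f} ({change:+.1%})")
```

In this toy case, moving a single assumption by one percentage point moves the answer by roughly a sixth, which is the kind of swing that makes a crowd-checked assumption more valuable than another decimal place.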
Imagine allowing the spreadsheet creator to designate certain assumptions as editable by some larger group (the entire public or a group specified by the creator). When the creator shares the spreadsheet, everything is fixed except for assumptions. When viewers disagree with an assumption, they can change it and see what the model looks like with their assumptions.
There should be three views — the original view, the edited view (by the viewer), and the community view. On the community view, it would be fun to play with various statistical approaches (arithmetic mean versus geometric mean, a weighted mean based on trust or past performance).
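The three views can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Assumption` class, its method names, the `trust` scores, and the sample viewers are all hypothetical, and a real product would need storage, permissions, and outlier handling.

```python
from dataclasses import dataclass, field
from statistics import geometric_mean, mean
from typing import Dict, Optional


@dataclass
class Assumption:
    """One input the creator has marked as editable (hypothetical design)."""
    name: str
    original: float                                   # the creator's value
    submissions: Dict[str, float] = field(default_factory=dict)  # viewer -> value

    def submit(self, viewer: str, value: float) -> None:
        """Record (or overwrite) a viewer's own value for this assumption."""
        self.submissions[viewer] = value

    def edited_view(self, viewer: str) -> float:
        """The viewer's value if they changed it, otherwise the original."""
        return self.submissions.get(viewer, self.original)

    def community_view(self, method: str = "arithmetic",
                       trust: Optional[Dict[str, float]] = None) -> float:
        """Aggregate all submissions; fall back to the original if there are none."""
        if not self.submissions:
            return self.original
        values = list(self.submissions.values())
        if method == "arithmetic":
            return mean(values)
        if method == "geometric":
            # geometric_mean needs strictly positive inputs
            return geometric_mean(values)
        if method == "weighted":
            # trust maps viewer -> weight; viewers without a score get weight 1.0
            trust = trust or {}
            weights = [trust.get(viewer, 1.0) for viewer in self.submissions]
            total = sum(weights)
            if total <= 0:
                raise ValueError("trust weights must sum to a positive number")
            return sum(w * v for w, v in zip(weights, values)) / total
        raise ValueError(f"unknown method: {method}")


# Hypothetical example: three viewers disagree with a 10% growth assumption.
growth = Assumption("revenue_growth", original=0.10)
growth.submit("alice", 0.08)
growth.submit("bob", 0.12)
growth.submit("carol", 0.20)

print("original: ", growth.original)
print("edited:   ", growth.edited_view("alice"))
print("community:", growth.community_view("arithmetic"))
print("trusted:  ", growth.community_view("weighted", {"alice": 3.0, "bob": 1.0, "carol": 0.0}))
```

The weighted variant is where trust or past performance would plug in: a viewer with a weight of zero is simply ignored, and one with a strong track record pulls the community number toward their estimate.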
It would also be cool to allow consumers to use a spreadsheet in an altogether different way: to create predictions about any outcome. What is the collective wisdom at the office about which project should be funded (I’m absolutely certain that distributing budgeting decisions to large groups inside a company would be vastly superior to today’s budgeting process experienced by the unfortunate masses of most large companies)? Which Little League team is going to win the Little League World Series? Who is going to get married first of those in your social group? Which of the people we interviewed should we hire? Which private company is going public next? Who is going to win American Idol? On this last one, I developed an American Idol prediction market for the 2007 season with John Hayes; it’s over and you can’t see much, but it was here. It was extremely accurate with only ~200 weekly traders. Instead of having to work with an engineer to build a prediction market, imagine if anyone with basic spreadsheet skills could do so.
+ Open Source Formulas
One of the powerful features of any spreadsheet is “formulas.” Need to calculate the present value of a potential investment? No problem: simply type this formula into your spreadsheet and replace the key variables with data or links to data: =PV(rate, NPER, PMT, FV, type).
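For readers curious what sits behind a formula like PV, here is a minimal Python sketch of the usual time-value-of-money calculation, using the same sign convention as the spreadsheet function (money paid out is negative, money received is positive). The function and parameter names are my own; `when` stands in for the spreadsheet's `type` argument (0 for payments at the end of each period, 1 for the beginning).

```python
def pv(rate: float, nper: int, pmt: float, fv: float = 0.0, when: int = 0) -> float:
    """Present value of a stream of equal payments plus an optional future value.

    rate: interest rate per period
    nper: number of periods
    pmt:  payment made each period (negative if paid out)
    fv:   lump sum at the end of the last period
    when: 0 if payments fall at the end of each period, 1 if at the beginning
    """
    if rate == 0:
        # With no interest, the present value is just the undiscounted total.
        return -(fv + pmt * nper)
    growth = (1 + rate) ** nper
    # Future value of the payment stream, then discount everything back to today.
    annuity_fv = pmt * (1 + rate * when) * (growth - 1) / rate
    return -(fv + annuity_fv) / growth


# Hypothetical example: what is paying 100 per year for 10 years worth today at 5%?
print(round(pv(0.05, 10, -100), 2))
```

A shared repository of formulas would amount to a catalog of small, documented functions like this one, contributed and reviewed by the people who rely on them.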
Why not open your service to third party developers in a self-service way to enable the creation of the world’s largest repository of formulas?
+ Macro Virtualization Engine
Microsoft did a very nice job with macros in Excel. So nice that anyone who wants to give Excel a run for its money will either need to figure out how to run existing macros or be prepared to wait a decade to take over the enterprise. What if you could build a virtualization engine that enabled all existing macros to work in your spreadsheet?
What do you think?
16 responses so far ↓
dave mcclure // May 20, 2008 at 3:12 am |
great stuff mike. really interesting thought piece.
one additional thought: what about a ‘visualization view / design UI view’, or a ‘marketing / distribution view’? either of these might provide ways for other actors to add value in areas that don’t change original data but might increase opportunities for consumption / conversion.
we’ve thought about something like this for SlideShare where we enable the market to help experiment / optimize with multiple design options to increase adoption.
mspeiser // May 20, 2008 at 3:20 am |
Great idea Dave. Sort of a Greasemonkey for spreadsheet presentation?
kent goldman // May 20, 2008 at 3:53 am |
really interesting. would love to see this idea extended to include collective intelligence around the assumptions which are the underpinnings for so many models – and have such a great impact on the output of the models. the idea could be to establish an Excelapedia (Excel + Wikipedia) for all types of assumptions. for example, what do we collectively believe to be the growth rates for a particular industry? just check Excelapedia and get the wisdom of the crowds without having to pay thousands to get one person’s perspective in a research report.
Sam Pullara // May 20, 2008 at 4:01 am |
I think this is on the right track. Powerpoint suffers from similar problems since it also doesn’t separate the model and the view. Many times I have wished to be able to pull facts and bullets from another source so that my PPT wasn’t immediately out of date.
The first problem with posing this as “spreadsheet 2.0” is that most current users of Excel can’t even learn the Mac version because they have trained themselves for all the shortcuts on the Windows version. Retraining them to think this way seems to be a much deeper divide.
Perhaps, instead of starting with a new desktop application, you start with a new data format. One of the things that is missing from the current syndication formats is support for efficient self-describing matrices. First come up with a solution for that, then use something like Y! Pipes to access it and combine it with other data sources. I was playing around with http://chartmechanic.com and one of the things I thought would be great would be to use Pipes (your virtualized functions) to process raw input data and munge it into the appropriate output format to be charted. There really isn’t an automated way to do that today.
Similarly, for very large datasets, we could use Hadoop to process them… it has the same underlying dataflow semantics and any Pipe can be compiled down into a set of mapreduce jobs.
As for Dave’s example, I think he is struggling with my Powerpoint problem. Why can’t I use raw data from Word and Excel and create a dynamic updating presentation? Shouldn’t Powerpoint just be structure + CSS?
mspeiser // May 20, 2008 at 4:12 am |
Thanks Sam. Very well said.
I agree that going after existing core Excel is probably not the right approach. One of the guys from the Improv team argues that’s why they failed. And I agree that separating the model from the view is a smart thing for just about any application, including Excel and PowerPoint.
The core idea I was trying to suggest here is that you could use the spreadsheet as an incredibly simple programming interface. Then by allowing things (like assumptions or probabilities) that can easily be munged together to be munged together by having consumers input their own assumptions, you could tap into the wisdom of large groups of people…
Jim Tybur // May 20, 2008 at 5:32 am |
Mike, first off, great to see you blogging in your new role as VC! On Sage, I specifically like the concept of prediction markets. A community spreadsheet is one good idea of where to incorporate them; broader “enterprise 2.0” services may be another. If anyone is interested, check out HBS prof Andrew McAfee for his in-depth thoughts on this; prediction markets are discussed toward the end of this thoughtful post: http://blog.hbs.edu/faculty/amcafee/index.php/faculty_amcafee_v3/how_to_hit_the_enterprise_20_bullseye/
John Beatty // May 20, 2008 at 4:55 pm |
I don’t know much about Excel, but it seems to have a “Web query” feature (http://msdn.microsoft.com/en-us/library/aa203721(office.11).aspx). So, it seems you could achieve some of your goals of collective wisdom-based assumptions with a simple website that computes collective wisdom (e.g. current prediction market price for an outcome) and publishes a URL with data compatible with Excel. Would this work?
Dave Brown // May 20, 2008 at 4:56 pm |
My favorite example from recent memory of tweaking inputs to a model for purposes of forecasting is this Buy versus Rent calculator published last year. It touches on a lot of themes I like, visualization and armchair economics:
http://www.nytimes.com/2007/04/10/business/2007_BUYRENT_GRAPHIC.html
I would love to know the aggregate of the crowd’s wisdom about the key assumptions of this model: annual home price appreciation, inflation expectations, etc, which could be reasonably inferred by recording what the crowd has punched in for those values.
I saw huge spreadsheets in financial firms, with complex VB logic doing number crunching that I thought would have been better suited in “traditional” software (C or Java). My reaction was that the use of excel had grown beyond its original intent, that excel has become an ad-hoc application platform for analytics. I agree that excel’s looseness of mixing data and formula can make it difficult to tease a model out of a given spreadsheet. But that looseness also makes it appealing: the user is often trying to solve a single, discrete problem (value a bond, forecast revenue), and so has little patience for a software engineer’s mindset of neat separation between data and model.
Attacking excel head-on may not be the right approach, but any product going after this space should be immediately accessible to an excel user – shortcuts too. Most everyone on earth inclined toward modeling and number crunching is doing it today in excel, and probably has a lot of blood/sweat/tears invested in becoming an excel wiz.
There’s tons of room, though, to make data manipulation and transformation easier and more expressive. Some of the VB macros I’ve seen are just awful. I’ve been imagining what a more intuitive data transformation language might look like, and in particular how it would operate on large datasets.
Scott Fitchet // May 20, 2008 at 9:14 pm |
I’d love to see a simpler and more transparent way of discussing Sabermetric formulas.
Using something like Google Spreadsheets as a friendly query front end to historical baseball data might be a good way to teach kids (and adults) about calculus, strategy, and visual communication (because it’s fun!). Right now you A) have to pay large subscription fees for the data and B) need to know how to program in a specific database language … both of which limit the collective design process you were talking about.
Moneyball is easy if you’re the only one who has access to the data and the “magic” formulas like “walks plus hits per inning pitched”.
jolly // May 20, 2008 at 11:28 pm |
There’s no question that spreadsheets are powerful programming platforms, but I question the notion that a spreadsheet is a simple or intuitive metaphor.
(“The key to innovation is to challenge patterns”, right? – nice TieCon slides, Mike!)
Maybe we’re all inured to it from familiarity, but I wonder how easy it is really for the absolute beginner who has never used Excel before to do anything productive with it. A blank spreadsheet offers no guidance as to how to proceed. Handling of erroneous data or formulas is somewhere between unhelpful and hostile. And the visualization (whether it’s spreadsheet formatting or charting) is downright bewildering.
In contrast, maybe there are simpler primitives that people could grasp more intuitively. For example, most people understand keeping lists of stuff. Maybe extend that to sharing lists and combining lists and maybe graphical representation of your lists. Extend that to sharing common “formulas” that can operate on the lists.
I’d wager that there are many interesting mini-apps that could be built simply with the metaphor of a list, or the slightly more complicated two column lists (e.g. name/value pairs, lists ordered by some ‘domain’ like time)
mspeiser // May 21, 2008 at 4:16 am |
Scott, I love it! Someone needs to do Sabermetrics for every other sport, too…
mspeiser // May 21, 2008 at 4:20 am |
Jolly, killer argument. And I agree that debate leads to better outcomes. So on that note…
1. If only 1-5% of people need to be power users, there are more than enough people on the planet who understand Excel well. Many more spreadsheet capable professionals than programmers. And most people who would create the types of models that would benefit from the wisdom of crowds + some statistical treatment are highly likely to have mastered Excel.
The other 95%+ of users would only need to know how to point a mouse at an assumption (sales, probability, batting average) and change the number. That’s my argument, at least.
Dave Brown // May 21, 2008 at 2:27 pm |
Oh and my favorite model from recent memory that allows you to tweak input assumptions is the nytimes buy vs. rent calculator: http://www.nytimes.com/2007/04/10/business/2007_BUYRENT_GRAPHIC.html It touches on a lot of themes I like, data visualization and armchair economics. would love to know the aggregate of the crowd’s wisdom about the key assumptions of this model: annual home price appreciation, inflation expectations, etc, which could be reasonably inferred by recording what the crowd has punched in for those values.
Dave Brown // May 21, 2008 at 2:33 pm |
My favorite example lately of tweaking inputs to a model is the buy/rent calculator on the New York Times website last year: http://tinyurl.com/2sdtvd
I’d love to know the sum of the crowd’s wisdom about the key assumptions of this model: annual home price appreciation, inflation expectations, etc, which could be reasonably inferred by recording what the crowd has punched in for those values.
Sean // June 4, 2008 at 5:53 pm |
The other vector I thought of is around BI analytics. Crunching financial models in Excel has been replaced for me with crunching salesforce.com and business data from BI warehouses. Business Objects is doing some cool stuff with http://www.crystalreports.com to share reports and offer SaaS access to data providers (thompson financials), I think extending that down to the spreadsheet level would be a killer application.
Some of it is going to be company or product line specific, but there is a lot of commonality particularly for reports that use shared data providers (salesforce.com, workday or other SaaS ERP products) etc.