Google Organize
August 9, 2020 1:57 AM

Social Movements Are Pushing Google Sheets to the Breaking Point (Medium soft-paywall) by Cindy Yu: "The proliferation of viral Google Sheets and Google Docs that break is a sign that collaboration has outgrown the collaboration tools at our immediate disposal. As the demographic of organizers and contributors has broadened and the scale of these projects has exploded, tools everyday citizens can use to spearhead these efforts have yet to catch up."
posted by adrianhon (41 comments total) 27 users marked this as a favorite
Considering the limitations of today’s most widely known tools for collaboration

Wait, the paragraph that preceded this statement was an interview with Google Sheets' decades-long product manager explaining exactly why this wasn't a limitation imposed by the technology; the flexibility of a Google Sheet is simply not good for what they're trying to do. But the thesis seems to be that the tool itself is the problem. Of course collaboration software built for its intended purpose scales well. Facebook and Twitter, or however this spread, managed to spread it effectively because they aren't open-ended spreadsheets; instead, users are restricted in how they can interact with the system.

I don't know the exact functionality of this, but I doubt it would be hard for someone to throw together a simple Python program to mimic this and scale, put it on AWS and be done with it in 2 days, tops. Most of that time would go to figuring out what the users are actually trying to do (requirements gathering).
posted by geoff. at 2:23 AM on August 9 [2 favorites]

Sorry, I commented too soon; it gets worse. The article says basically what I stated, then says we need a no-code solution. I'm going to go out on a limb and say if you are hitting scaling issues, someone in that group knows how to do basic coding, but I digress.

The author then says there is a no-code solution, using a Google Form with a locked Google Sheet. But it is "not intuitive" which I guess, okay? Again, this is by definition a highly educated and tech literate subset of the population, and not one of this huge crush of people knew about Google Forms? Or have we become so spoiled that we don't remember that a lot of technologies don't know they have scaling problems until it's too late? Remember Twitter going down all the time? Even Metafilter slowed to a crawl around 10AM ET. Sometimes you don't know you have scaling problems until you run into them; maybe we got so good at solving 90% of web scaling problems that we forgot that scaling is a thing.
posted by geoff. at 2:30 AM on August 9

I think you’re being overly reductive and dismissive of problems faced by technically-savvy people (some of whom I know are pretty decent programmers) who cannot just come up with a replacement in 2 days.

Also, using Google Forms and a view-only Sheet is not a full solution, as the article clearly points out: “Kaluvai may have been able to avoid some of the difficulties.” In my experience it slows data entry and editing down considerably, especially if there’s a lot of complex data that can’t be gathered easily via a form.
posted by adrianhon at 3:17 AM on August 9 [4 favorites]

Maybe this'll get the ball rolling for adoption of decentralized p2p collaboration alternatives for social justice stuff?
posted by efskap at 4:09 AM on August 9 [8 favorites]

There seems to be an awful lot riding on Google's willingness to keep things running. When they decide they are bored with it, which they well may, see ya later, and it will be off to the Google graveyard. There are plenty of alternatives, but no one uses them.
posted by nothing.especially.clever at 5:06 AM on August 9 [7 favorites]

Again, this is by definition a highly educated and tech literate subset of the population

Is this coming from a similar place as pronouncements that the youth of today are "tech natives" and know how to do everything technological?

Highly educated =/= tech literate (talk to any group of university faculty)
"Tech literate" often =/= knows how to do non-standard things with tech, or knows anything about coding
Uses available tech tools for organizing =/= tech literate (much respect for the amazing organizing work going on, but oh my gosh the majority of people in my local activist community are not particularly "tech literate" if that means having any idea how things are working in the background, beyond just basically how to use the tools; some pretty good art and music, though, and folks could build you a very solid house or take care of many of your basic health care needs)
posted by eviemath at 5:31 AM on August 9 [27 favorites]

Somewhere inside Google, there's a bug report from a QA person who was doing performance testing a decade ago saying that Sheets breaks down at a certain number of collaborators and that bug report is almost certainly marked "WONTFIX" and "works as designed".
posted by octothorpe at 7:02 AM on August 9 [16 favorites]

It's a hard problem. It took the database industry decades to solve. If it were a priority, I'm sure Google could as well, but it would be expensive and I can't imagine they'd get much in return.
posted by CheeseDigestsAll at 7:30 AM on August 9 [1 favorite]

I work with a crop of co-op students in a STEM lab setting. We get a pretty good slice of the top of the upper year science and engineering students from a few universities. Of the dozen or so we see in a year, perhaps 2 or 3 are willing to learn how to do scripting in Python, VBA or MATLAB or some of the more custom programming languages we use.

This number has not noticeably changed since I was a student. There are a few kids who dive into it, but most won't unless they have to.

"Digital native" kids can use the latest SM clients, but most don't go much deeper than that.
posted by bonehead at 7:43 AM on August 9 [5 favorites]

Unfortunately there is a long history of people trying to use spreadsheets in a manner for which they are not designed. Throwing collaboration into the mix probably makes the situation even worse. I would have looked first for something that had low overhead, no coding and a universal distribution. Perhaps some combination of email, a listserv and an online presence and archive. Google Groups might be a place to start...
posted by jim in austin at 8:10 AM on August 9 [2 favorites]

Massive concurrent collaboration is easy if you don't need it to be usable (and the definition of "usable" here is very low and one that many/most would still find unusable). It's super duper duper hard if you need it to work well and be intuitive in any way. The best anyone has been able to do is Wikis and then, subsequently, the

As the article notes, the problems here are social - too many people editing the same thing means that there is too much toe-stepping. We know how to solve this problem with code: we use git and GitHub. We know how to solve this problem with databases: we use transactions. We don't have any idea how users even conceptualize what a correct solution is to these problems in the realm of WYSIWYG documents and spreadsheets, much less how to build it.
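
For anyone wondering what "we use transactions" looks like in practice, here is a minimal sketch using Python's built-in sqlite3 module. The offers table and the claim function are made up for illustration; the point is only that a conditional update inside a transaction lets exactly one of two competing claimants win, with no toe-stepping.

```python
import sqlite3

# Hypothetical schema: a list of offers that volunteers can claim.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offers (id INTEGER PRIMARY KEY, item TEXT, claimed_by TEXT)")
conn.execute("INSERT INTO offers (item) VALUES ('spare laptop')")
conn.commit()


def claim(conn, offer_id, user):
    """Claim an offer only if nobody has claimed it yet; return True on success."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE offers SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL",
            (user, offer_id),
        )
        # rowcount is 1 if our update applied, 0 if someone got there first.
        return cur.rowcount == 1


print(claim(conn, 1, "alice"))  # True: first claim succeeds
print(claim(conn, 1, "bob"))    # False: the offer is already taken
```

Note that this is exactly the part a shared spreadsheet leaves to its users: nothing stops two people from typing into the same cell.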

Imagining that this kind of problem is an "eager undergrad in a weekend" problem feels like deeply wishful thinking.

(full disclosure: I do happen to work for Google, but not in this area or anywhere near it, and my opinions are my own and not theirs)
posted by pmb at 8:19 AM on August 9 [18 favorites]

Yeah, this is a hard-ish problem.
I often say about things 'oh, I could code that in a weekend'.
I couldn't do this in a weekend.
posted by signal at 9:50 AM on August 9 [4 favorites]

“Google Docs has shown us that things that look like documents are compelling,” Saperia said when describing the advantages of Google Docs. “People understand them and can edit them without too many problems.”

We don’t really understand them, as a group. Mostly this leads to formatting errors which are not usually as bad as spreadsheet errors, but toe-stepping sure is. And even “digital natives” step on their own toes when copying out of documents - different toes bruised by linked or unlinked copies, but trip hazards all the same.

Eh. That felt like a vaporware ad for something that could be great but will need so much user design and testing. Also made me sentimental about the tale of users flooding into Pinboard, I should go compare user base sizes.
posted by clew at 10:43 AM on August 9 [2 favorites]

We know how to solve this problem with code: we use git and GitHub

Can you provide more detail why the version control in Github isn't applicable here?
posted by Reasonably Everything Happens at 10:45 AM on August 9

I think pmb pointed out why GitHub isn't applicable here: "We don't have any idea how users even conceptualize what a correct solution is to these problems in the realm of WYSIWYG documents and spreadsheets, much less how to build it."

GitHub doesn't allow for real-time WYSIWYG editing, which is presumably what someone who thinks Google Sheets or Google Docs is the answer would want. Using git instead would require some level of training of even technically literate non-coders, and also a bit of a paradigm shift in terms of how they think of the steps in "editing a document".
posted by grae at 11:39 AM on August 9 [6 favorites]

Git, along with all the other version control systems aimed at programmers that I've ever seen, is also the wrong tool for this. Critically, in code, the programmer isn't typically staring at a single line of code in isolation, and changes to the lines above and below will have an effect on that line of code. There is no programmer workflow that involves 100 or 1000s of programmers editing a single file concurrently. Version control systems "cheat" by making a local copy to use (sometimes called "checkout", using a library metaphor), and then there's some sort of social organization for using the tools to merge those changes back together, but that's not collaborative editing. VS Code has nascent support for 2 programmers to collaborate concurrently but that's a long, long way from hundreds.

Thing is, for the data entry, viewing and searching being asked for here, you don't need truly collaborative editing as enabled by modern applications, and a basic data entry web app (known as a "CRUD" app in the industry) would be sufficient. Website performance is measured in requests per second, sometimes up in the thousands (the largest sites on the Internet do even more). In this conversation, however, we're talking about hundreds of users. In order to scale to hundreds of users, each making a single request every few minutes, the system would need to support approximately... 2 requests per second. A system hosted using your laptop as a server could manage that.
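
As a back-of-the-envelope check on that figure, here is a tiny sketch. The 300 users and the 3-minute interval are illustrative assumptions standing in for "hundreds of users" and "every few minutes", not numbers from the article.

```python
def requests_per_second(users: int, seconds_between_requests: float) -> float:
    """Average request rate if each user sends one request per interval."""
    return users / seconds_between_requests


# Assumed for illustration: 300 users, one request every 3 minutes each.
rate = requests_per_second(users=300, seconds_between_requests=180)
print(f"{rate:.2f} requests per second")  # 1.67, i.e. roughly 2
```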

Collaborative editing is hard because the message load grows roughly with the square of the number of users, even for something as simple as moving the cursor, since every update needs to be sent to everyone else. That makes the RPS requirement dramatically higher, for no benefit to data entry once the columns have stabilized. Hence a separate data-entry page as a workaround.
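
To put rough numbers on the fan-out, here is a sketch. It assumes a naive design in which every user emits one update per second (a cursor move, say) and the server relays each update to every other user. Real systems batch and throttle, so treat this as a worst-case illustration: the delivered-message count is n × (n − 1).

```python
def relayed_messages_per_second(users: int, updates_per_user_per_second: int = 1) -> int:
    """Messages the server must deliver if each update goes to every other user."""
    return users * (users - 1) * updates_per_user_per_second


for n in (10, 100, 1000):
    print(n, relayed_messages_per_second(n))
# 10 90
# 100 9900
# 1000 999000
```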
posted by fragmede at 11:58 AM on August 9 [10 favorites]

I think most people would understand a turn based editing model, with an accessible history, better than git’s forking and merging and (sweet Fates) rebasing.

Forms suggest that, since we’re used to them being the UI for database transactions, but if they’re really transactional users will be locking each other out.
posted by clew at 12:02 PM on August 9 [1 favorite]

I think jim_in_austin nails it. The use cases in the article are thoroughly solved - just not with shiny branded products that get talked about by university tech resource centers.

First scenario: matching students in need to students who can help. Use Google Groups as jim_in_austin suggested, or sign up for a hosted forum service like it's 1992. Students in need post, students who can help reply. Admins bump unanswered posts every now and then.

Second scenario: crowdsourced handbook. Make a wiki. I'd argue that wikis are the *only* example of massively concurrent collaboration that humans have actually made work. Using the same mode for reading and editing for a long term document sounds like a great way to burn out your moderators as they chase down and revert every accidental keystroke edit. Saperia looks like an old-time Wikipedian, so I'd really be interested in hearing why he thought it wouldn't work.

On preview: yeah, what fragmede said. Except I'd amend the "no programmer workflow that involves 100 or 1000s of programmers editing a single file concurrently" to all human workflows.
posted by McBearclaw at 12:04 PM on August 9 [4 favorites]

I think these things always end up as spreadsheets because someone eventually asks "is there any way to get this into a spreadsheet?" and so you might as well have it as one to begin with.

Back in the day, personal database systems like FoxPro, Clarion, FileMaker, Access, etc. were as popular as spreadsheets. Airtable looks like the modern incarnation of these tools, but it's priced per-user. Where's the thing between spreadsheet and LAMP-ish custom website, designed for humans (and free and open source and peer-to-peer and completely secure and private and...)
posted by RobotVoodooPower at 12:05 PM on August 9 [3 favorites]

Also, now I’m wondering if people understand that things like seat-reservation form submission timeouts are preventing toe-stepping, or if it just feels like a sales push.
posted by clew at 12:06 PM on August 9 [1 favorite]

I do feel like there's a proper, kind-of-general "solution" waiting out there that sits between "custom-code this exact use case" and "dump it in a spreadsheet". No one's really nailed that space yet, though. Microsoft tried it with Microsoft Access. Lotus Notes was an attempt at this, which was excellent, but REALLY didn't scale well.

In current tech, Airtable looks like it's made the "better-than-a-spreadsheet spreadsheet" that analysts have been craving forever. People are using Notion to combine an advanced note-taking tool with tables and some limited custom business rules. Unqork claims that their product can replace every custom enterprise app with a "no-code" solution. They might be claiming a bit ahead of what they can do. In all these cases, there are "reasonable" limits - the target audience is for limited numbers of collaborators, which is easy to scale. They've solved the "easy part" of the hard problem.

But, either way, there hasn't been a "killer app" yet that can make people think of using it before busting out a spreadsheet.
posted by Citrus at 12:24 PM on August 9 [1 favorite]

“Google Docs has shown us that things that look like documents are compelling,” Saperia said when describing the advantages of Google Docs. “People understand them and can edit them without too many problems.”

Hammers have shown us that things that look like nails are compelling. People understand them and can hit them without too many problems.

The fact that you're unlikely to enjoy the result of over a hundred other people trying to hammer the same nail as you at the same time as you reveals, to my way of thinking, neither defects in nails nor design flaws in hammers.
posted by flabdablet at 1:14 PM on August 9 [8 favorites]

They might be claiming a bit ahead of what they can do

Software marketing folks making unrealistic claims that utterly ignore current or indeed reasonable future engineering? You shock me.
posted by flabdablet at 1:16 PM on August 9 [1 favorite]

There's not a single general solution because there are like ten problems being conflated here. Asking for exactly one solution that covers everything magically is kinda ridiculous. You end up with something that kinda-sorta works in most cases and breaks in unfortunate ways at the edges... Which is exactly what we've got!

(Caution: The following comment is full of spicy takes.)

Here are the main activist use cases that I've seen in action:
* We have N canvassers/phone bankers/etc collecting similar structured data, and we need it in a central, validated form at the end of the day.
* We want to connect an undifferentiated mass of people with needs to a similarly undifferentiated mass of people with supplies.
* My subcommittee is writing a press release or other doc.
(feel free to suggest more!)

The first case needs some kind of forms with good data validation. The Google Form + Sheet approach is pretty good here, but inevitably burns untold hours of volunteer time on data cleaning after the fact anyway.
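
To show what "good data validation" could mean at entry time, here is a minimal sketch. The field names, the allowed support levels and the 5-digit ZIP rule are all hypothetical; a real campaign would have its own schema. The idea is to reject or flag a bad record when it is entered rather than cleaning it up at the end of the day.

```python
import re
from dataclasses import dataclass

# Hypothetical fields and rules for a canvassing record; real campaigns will differ.
ALLOWED_SUPPORT_LEVELS = {"strong", "lean", "undecided", "opposed"}
ZIP_PATTERN = re.compile(r"\d{5}")


@dataclass
class CanvassRecord:
    canvasser: str
    zip_code: str
    support_level: str


def validate(record: CanvassRecord) -> list:
    """Return a list of problems; an empty list means the record is clean."""
    problems = []
    if not record.canvasser.strip():
        problems.append("canvasser name is required")
    if not ZIP_PATTERN.fullmatch(record.zip_code.strip()):
        problems.append("zip_code must be exactly 5 digits")
    if record.support_level.strip().lower() not in ALLOWED_SUPPORT_LEVELS:
        allowed = ", ".join(sorted(ALLOWED_SUPPORT_LEVELS))
        problems.append("support_level must be one of: " + allowed)
    return problems


good = CanvassRecord("Sam", "02139", "Lean")
bad = CanvassRecord("", "2139", "maybe")
print(validate(good))  # [] since nothing is wrong
print(validate(bad))   # one message per failed rule
```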

The second case seems to be what the article is focused on. This is a kinda weird space where it's not at all clear to me that the 'anything goes' approach is better than something a bit more structured. It's nice to SEE that there's lots of activity going on, which an overloaded sheet is GREAT for, but that you'd lose on entering stuff in a random form and waiting for someone to connect you with someone else. (Side note: every one of these things that I've seen is COMPLETELY HORRIFYING from a privacy and security standpoint. "My name is secret, but my phone number is $PHONE. Here are three medical conditions I have, and, also, I would also like to bomb the White House. Look forward to hearing from you!")

The last use case is /pretty-well/ served by Google Docs, with the caveat that seemingly every organization has a bunch of people who REALLY want to die on the hill of sending Microsoft docs back and forth with increasingly convoluted filenames. (Using random non-secure email accounts, of course...) I've definitely seen cases where something not as free-wheeling as a collaborative doc but not as heavy as Git would be awesome; change revisions and code review really are a useful analogy. We want a draft, a way to suggest revisions to the draft, accept and reject revisions, and have meaningful 'checkpoints' along the way. You can totally do this in Google Docs, but the check-pointed history thing is just a bit too loosey goosey and hidden away to really be useful.
posted by kaibutsu at 2:28 PM on August 9 [7 favorites]

Just like many of the social solutions proposed on MeFi skip over the “...and then the streets ran 3ft deep with blood...” portion of the result, the technological solutions seem to leave out the “...and then the Brinks trucks full of cash started backing up to my front door...” part that would surely occur if you could solve the problems as easily as proposed.

If you could reliably solve this issue, at scale, in a weekend, send over your resume because my larger team has reqs that pay ~$500k/year in salary/stock. Seriously.
posted by sideshow at 3:09 PM on August 9 [7 favorites]

something not as free-wheeling as a collaborative doc but not as heavy as Git

I have twice been pretty sure that Markdown, Subversion, diff/merge would solve problems causing my group much pain. And my descriptions of the constraints didn’t put them off. Looking at plain text did, even before we got to diffing. Someone’s going to do for diff what Slack did for IRC and earn those truckloads of cash.
posted by clew at 3:22 PM on August 9 [2 favorites]

I once worked on a team where we made a CMS for non-technical people. Being engineers, we all thought, well, what if we just made Git, but for documents? Then wrap it in a nice UI? Simple, right?

Our users just did not grasp the Git model. Mostly, they checked out a single branch, then everyone worked on the same branch for eternity. Sometimes there were multiple branches, but then our users would try to do things like merge branches that diverged months ago back into master, and even in actual Git, you're not going to have a good time.

Anyway, the product was sunset before it got wide adoption, and the work of ~10 engineers over 3 years was lost to the abyss.
posted by airmail at 3:37 PM on August 9 [6 favorites]

We want to connect an undifferentiated mass of people with needs to a similarly undifferentiated mass of people with supplies.

Come to think of it, the most well-known example of this use case is... dating apps.
posted by airmail at 3:40 PM on August 9 [6 favorites]

The last use case is /pretty-well/ served by Google Docs, with the caveat that seemingly every organization has a bunch of people who REALLY want to die on the hill of sending Microsoft docs back and forth with increasingly convoluted filenames.

Christ, this brings back traumas.
posted by Merus at 4:32 PM on August 9 [1 favorite]

For those of you interested in the software side, Audrey Tang's insights on the development of SocialCalc and EtherCalc are fascinating reading. A shared spreadsheet is, behind the scenes, basically a chatroom.
posted by clawsoon at 4:48 PM on August 9 [2 favorites]

Paywall bounces me out now.
posted by doctornemo at 4:50 PM on August 9

But a bunch of us managed to break Google Sheets in a way this March, when we crowdsourced data about colleges and universities responding to COVID-19. Ultimately had to shut it down.
posted by doctornemo at 4:51 PM on August 9

(feel free to suggest more!)

* We want to collect a list of resources (eg. an anti-racist reading list) that many people may add to.

Differs from your case 2 in that there is no back-and-forth between sources and users of info. I've mostly seen this done in Docs, but it's really a database application.

Even the back-and-forth version (an example where anonymity is less important would be coordinating mutual aid offers and asks) should be a database.

But, as other folks have mentioned, there are no small, WYSIWYG, easily-learnable personal database apps nowadays. Databases are instead aimed at use cases with extremely large datasets, where you're unlikely to have lots of casual users entering records individually by hand or where the scale is such that you would just hire a programmer to write the specific web interface for your SQL database, and where you won't have many individual users writing their own single or few use case queries. You can sort of do this with Forms, but that is a little too clunky if someone wants to enter, say, five records. And that doesn't help you with the queries; you'd need someone who knew what they were doing more than the casual Google Apps user to set something up in the background.

On top of (and related to) all of that, kids these days don't know what databases are - or even that they exist. No, really: I had a reasonably technically literate student - competent LaTeX user, had done a little bit of programming, though hadn't studied computer science formally - who had to do a thing that was a classic database sort of application, and was asking my advice on how to set it up. And I said, "oh, you should use a database for that." And the student said, "what's a database?" Because we don't have personal database programs anymore, just the big, complicated ones, they're now an advanced topic as far as computer literacy skills go.
posted by eviemath at 5:34 PM on August 9 [7 favorites]

Paywall bounces me out now.

Perhaps this link may work:
posted by cynical pinnacle at 6:12 PM on August 9 [1 favorite]

Microsoft tried it with Microsoft Access.

Which I use all the time. The thing is, though, your average non-nerd does not have any idea about how relational databases work or what table normalization is. You're speaking alien language if you try to talk about it. Most people, even if they knew how to mechanically make a table in Access, make a massive one-table-hundred-columns kitchen sink database and don't understand why they would use this unfriendly interface when they could make the same table in Excel in 30 seconds.

You can only get them to use a database if you design an intuitive front end for it as a custom app.
posted by ctmf at 6:28 PM on August 9 [4 favorites]

Use cases such as this demand immediate availability. Someone sees a need, they aren't going to wait a few days for someone to sic Python on it. For better or for worse, Google apps are where we're at right now. (I'm highly tech literate, as both a hobbyist and a professional of thirty years. I would run screaming if someone expected me to code something, even given a timeline of weeks.)

I also think something was missed in the responses to the original GitHub comment. I don't think it's that Git has anything to do with this, but that the people who created GitHub solved the scaling problem for their use case, which is thousands of users pushing and pulling millions of lines of code simultaneously, then adding web pages and pipelining to it.
posted by lhauser at 7:04 PM on August 9 [2 favorites]

I’ve been thinking about how often activists would want to know who asserted something, and whether they still seemed credible. (≈Surely this≈ will inspire people to manage a web of trust!)
posted by clew at 7:27 PM on August 9 [2 favorites]

When people are saying the social scaling problem is more limiting than the technical scaling problem, I do see a lot of places where that's true. Operational transformation techniques can support more simultaneous editors on a piece of writing than you want editing a piece of writing.

But the piece did describe technical limits, like "the thing doesn't load." Or max 100 editors on a spreadsheet. That's a real limitation; spreadsheet editing can be made to work with a lot of people, if it's the right kind of spreadsheet for it, and if the creator knows how to protect some areas, how to use validation rules, and how to structure the editable parts. (Tip: if people are entering rows, have a column for name/handle first. If two people start simultaneously editing, one will notice their name disappearing.)
posted by away for regrooving at 10:29 PM on August 9

I do think there are possibilities for designing software that's both broadly familiar and usable, and paves over some of the common pitfalls.

What if you could come into the spreadsheet in "row-adding" mode (maybe you don't have full edit permission, maybe you do but the URL sets this mode)? In this mode, you never clash with someone else adding a row, you add independently (in an order you don't control, by timestamp). You can see other people's edits (with lag from batching) or turn that off for lower bandwidth, but you can only edit your own rows.

Basically this is getting the desired effects of the form-into-sheet, while still looking like a sheet and mostly acting like one. You can edit, you can copy-and-paste. You can't use formulas with cell references out of the row, but probably most of these aren't using formulas much.
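
A rough sketch of that "row-adding" mode, to make the rules concrete. Everything here is hypothetical (the class and method names are mine, and it is a single-process, in-memory toy with a server-assigned counter standing in for the timestamp ordering), but it shows the two rules: adds never clash because the server picks the position, and only the owner of a row can edit it.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Row:
    row_id: int
    owner: str
    sequence: int  # server-assigned order, standing in for a timestamp
    cells: dict


class RowAddingSheet:
    """Append-only sheet: anyone can add rows, only the owner can edit a row."""

    def __init__(self):
        self._rows = {}
        self._counter = itertools.count(1)

    def add_row(self, owner: str, cells: dict) -> int:
        # The server picks the position, so two simultaneous adds never clash.
        n = next(self._counter)
        self._rows[n] = Row(row_id=n, owner=owner, sequence=n, cells=dict(cells))
        return n

    def edit_row(self, user: str, row_id: int, changes: dict) -> None:
        row = self._rows[row_id]
        if row.owner != user:
            raise PermissionError(f"{user} cannot edit a row owned by {row.owner}")
        row.cells.update(changes)

    def rows(self) -> list:
        # Everyone can see every row, in server order.
        return sorted(self._rows.values(), key=lambda r: r.sequence)


sheet = RowAddingSheet()
a = sheet.add_row("alice", {"need": "groceries"})
b = sheet.add_row("bob", {"offer": "car"})
sheet.edit_row("alice", a, {"need": "groceries, Tuesday"})
try:
    sheet.edit_row("bob", a, {"need": "nothing"})
except PermissionError as err:
    print(err)  # bob cannot edit a row owned by alice
```

A real version would still need the batching and lag described above, plus a way to prove who "you" are, which is where most of the actual work would be.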
posted by away for regrooving at 10:40 PM on August 9

We could go back to the old way of mailing around an attached Office document and then having ten different people edit copies and then mail them around again with names like customer_data_bob_v12_merged_with_jane_v5.xlsx.
posted by octothorpe at 9:49 AM on August 10

Thank you, cynical pinnacle!
posted by doctornemo at 10:31 AM on August 10

This thread has been archived and is closed to new comments