A personal Dewey Decimal system
November 23, 2021 2:56 PM

Johnny•Decimal helps organize your (digital) life. Thousands of emails. Hundreds of files. File structures created on a whim and six layers deep. Duplicated content, lost content. We thought search would save us from this nightmare, but we were wrong. It's simple in practice, but takes some thought to build the right structure for yourself. And that's the beauty of it - the structure is what best suits you and how you naturally group things.
posted by jpeacock (91 comments total) 44 users marked this as a favorite
 
Confused and sad ftw.
posted by dobbs at 3:12 PM on November 23, 2021 [6 favorites]


Search > Index
posted by ChurchHatesTucker at 3:20 PM on November 23, 2021 [5 favorites]


I recoil at this concept. I feel like I'm losing my mind the further along I read.
posted by tiny frying pan at 3:21 PM on November 23, 2021 [18 favorites]


Who are these people for whom every digital or physical asset sits neatly within a single category, though?

“Why don’t we have access to [x]? I looked in the [policy.security-and-access] folder and couldn’t find anything?”
“Oh, it’s because granting permission to [x] is expensive, as detailed in [finance.purchasing-justifications]”
posted by parm at 3:23 PM on November 23, 2021 [23 favorites]


I am delighted if this works for anyone but every time I see it it just feels like so many round holes to try and hammer the square, triangular, round, and other-shaped pegs of my files into. Especially the "nothing more than two subfolders deep", my comics projects have a lot of parts that I'm much happier dealing with when they're nicely organized in subfolders.

I also like to make liberal use of aliases to make it easy to get to stuff I'm currently working on, I just dump those in the root of my main "working" directory. If they're current projects the alias gets a space at the front of the filename, and probably a shortcut in the Finder's sidebar; removing the space and deleting the Finder shortcut carries a nice bit of ceremonial weight after working on something for a few years straight!
posted by egypturnash at 3:24 PM on November 23, 2021 [9 favorites]


I love this! Thanks.
posted by dusty potato at 3:31 PM on November 23, 2021 [2 favorites]


I wanted to like this, because it seems so simple and I like things to be organised. But there seems to be a fatal flaw that I can't get over:
In these examples, 42.18 is the 18th thing you’ve saved in your 42 category.
This means you can't have more than 99 items in any category (because the schema limits you to two digits after the decimal).
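
The limit described above can be sketched in a few lines. This is only an illustration, assuming an ID is exactly two category digits, a dot, and two item digits, as in the quoted 42.18 example; the names `parse_id` and `next_id` are hypothetical and not part of the Johnny•Decimal site.

```python
import re

# Assumed shape of an ID: two digits, a dot, two digits (e.g. "42.18").
ID_PATTERN = re.compile(r"(\d{2})\.(\d{2})")


def parse_id(text):
    """Split an ID such as '42.18' into (category, item) integers."""
    match = ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a two-digit.two-digit ID: {text!r}")
    return int(match.group(1)), int(match.group(2))


def next_id(category, used_items):
    """Return the next free ID in a category, or fail once 99 is used up."""
    candidate = max(used_items, default=0) + 1
    if candidate > 99:
        # Two digits after the decimal point leave no room for item 100.
        raise OverflowError(f"category {category:02d} is full")
    return f"{category:02d}.{candidate:02d}"
```

Calling `next_id(42, [1, 2, 17])` gives `"42.18"`, and the hundredth item in any category raises an error, which is the flaw being pointed out.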

I agree that the idea of relying on search functions alone to find files generally doesn't work, because it relies on people naming things in a logical way and that doesn't happen. My experience is that relying on strict naming conventions gives you a better chance of being able to find things by search than anything else.
posted by dg at 3:32 PM on November 23, 2021 [5 favorites]


There’s only one place anything can ever be
I am old enough to have used many physical file systems, including decimal ones, and I've come to conclude that computer file systems fundamentally break the metaphor. A physical file system is a single-level hierarchy; there are boxes or shelves, and then there are files or folders, but they don't nest, and they certainly don't nest infinitely. Within a file you could have dividers, or tabs, or plenty of other things, but a computer file system is so much more powerful than this it really should be thought of as a different system not comparable to physical files. There was only one place anything could ever be, but that was before things were infinitely copyable, and there were shorthand systems for numbering information at a human-readable level, but that was before we had incredibly powerful machines that are really good at sorting information fast.

This system gives the game away by referring to indexes which, again, were essential parts of any large system: the library card index was just the best of them; indexes were as varied and precious as the systems they indexed. But computer file systems already have indexes far better and faster than humans can make, accessible in the form of sorts and limits and searches! Note that the place most white-collar workers store their most important work is also the technology with the most advanced internal search function: not a file system, no, I'm talking about email.

(The problems creep in where you get a filesystem full of filenames like file_draft_REVISED-version2_withfiascocomments.docx~old_deleteme. But that's, uh, a different problem).
posted by Fiasco da Gama at 3:36 PM on November 23, 2021 [13 favorites]


If a document says deleteme, it is deleted by me.

For some file structures, folders by year work; in many cases, 10 year old files can be pruned savagely.

I save stuff from twitter and fb, edit images to get rid of cruft. Part of my file organization is that most stuff gets saved to Incoming, because you can't let Windows have much say over where to put files, or you're even more screwed than usual.

Most of everybody has 0 sense of taxonomy, organization, logic, sense, so all systems fail pretty hard.
posted by theora55 at 3:41 PM on November 23, 2021 [4 favorites]


... the place most white-collar workers store their most important work is also the technology with the most advanced internal search function: not a file system, no, I'm talking about email.
A system that completely fails when a person leaves the organisation and every single record in their mailbox is lost.

(The problems creep in where you get a filesystem full of filenames like file_draft_REVISED-version2_withfiascocomments.docx~old_deleteme.
*runs screaming from the room*
posted by dg at 3:50 PM on November 23, 2021


None of the areas or categories overlap. There’s only one place anything can ever be.

This is both unintentionally (I hope) hilarious and deeply triggering for me.
posted by bumpkin at 3:51 PM on November 23, 2021 [13 favorites]


I agree, in principle. In practice, I'm deeply skeptical. (But, it's interesting to think about and I'm happy to have read it.)

Even my current system of descriptively named directories fails as often as it succeeds. Do I look for photos of an experimental setup in "talks," 'cause I took it just before a talk, or in "group," 'cause it's part of a multicollaboration effort, or in "[particular experiment]" 'cause the specific device is paid for by a specific project, or in "[student name]" because they used it in a paper, or in "temp" 'cause I was in a hurry. . . I'm not sure adding an additional lookup table to translate that into numbers actually helps. It just adds an additional step and makes things less transparent.

Also, I work with a lot of people who are not neurotypical and are pretty good at adapting to strange behavior. But, adding indices to your email subjects is a pretty advanced level of weirdness. I'm not sure I'm up to figuring out how to categorize a response to "Re: 8.32 Re: 2.34: Re: 11.1: Re: 1.13: Re: 7.9 let's meet next week"

Someone other than me needs to figure out how to make command line tagging work without fragile dependencies and GUI junk. Until then, or until we get Star Trek AI, it's very long file names and constant use of "locate." Which is bad, but less bad than any other option I know about.
posted by eotvos at 3:55 PM on November 23, 2021 [1 favorite]


C'mon folks, this isn't that hard. Just follow the Abehammerb method. Keep thousands of tiny tabs and windows open until you slowly lose your mind.
posted by Abehammerb Lincoln at 4:04 PM on November 23, 2021 [19 favorites]


I've already lost my mind. Is it in 13.21 or 21.04 ?
posted by bumpkin at 4:07 PM on November 23, 2021 [12 favorites]


This sounds like a form of Zettelkasten. Which is very hot right now.
posted by teh_boy at 4:08 PM on November 23, 2021 [2 favorites]


good lord, this isn't even how the actual Dewey Decimal System works.
posted by The demon that lives in the air at 4:38 PM on November 23, 2021 [15 favorites]


i have decided this is satire
posted by glonous keming at 4:47 PM on November 23, 2021 [20 favorites]


I'm not going to do it, but I bet it works as well as whatever it is I do do (much of which depends on annotating everything that doesn't innately have a date, and searching for things on disk by date of "when I probably started working on this").

Also, big 1890s-1920s Universal System energy, very quaint.
posted by clew at 4:49 PM on November 23, 2021 [4 favorites]


eotvos: I'm not sure I'm up to figuring out how to categorize a response to "Re: 8.32 Re: 2.34: Re: 11.1: Re: 1.13: Re: 7.9 let's meet next week"

What, you forgot your Geekcode?
posted by wenestvedt at 4:54 PM on November 23, 2021 [6 favorites]


This is a basic records management strategy, useful for administrative records if you add some things like naming conventions, and some protocols around email retention or archive requirements. Operational records systems which deal with project based documents often need descriptive strategies beyond numbered buckets.
One example
https://www.saskarchives.com/sites/default/files/pdf/517_arms2014_links_jan302019.pdf

Who are these people for whom every digital or physical asset sits neatly within a single category, though?

Archivists and records managers! :)
posted by gulfofaraby at 4:58 PM on November 23, 2021 [5 favorites]


This lost me at step one and every subsequent step was also ridiculous to me. That said, I do have a mentality of trying to organize my file system, but gradually am coming around to search as a faster way to find files. The result so far is that I end up putting both categories and keywords in my file names now. This kind of helps me find the few things I am actively working on, but also random stuff that I have archived but need again much later.
posted by snofoam at 5:11 PM on November 23, 2021


Full-text search, especially, has been the killer application for me - my workplace uses Google Drive for everything, and it really is a game changer to be able to just remember whichever key words are relevant to what I'm working on.
posted by sagc at 5:14 PM on November 23, 2021 [6 favorites]


I would love to have the sort of life where my problems never overlap, but alas, the damn things do.
posted by brook horse at 5:16 PM on November 23, 2021 [5 favorites]


This is madness. Replacing a series of self evident words with a numerical index you have to memorize?!
posted by JDHarper at 5:23 PM on November 23, 2021 [8 favorites]


To me this kind of is hitting the nail on the head, but hitting lots of other unnecessary crap around the nail as well.

The details of the actual organizational scheme are the last 10%. The first 90% are the habits:
- Save it in the "right" place the first time. If there isn't a right place yet then figure out how to extend your scheme right now. No temporary dumping grounds.
- Give it a very specific name the first time. Not report.ppt but project-XYZ-prototype-report-13-Aug-2019.ppt.

Adopting this scheme does force those habits, but surely the numerical scheme itself is overkill for most people's personal stuff...
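
The "very specific name the first time" habit can be made mechanical. The sketch below is only one possible convention, modeled on the example filename above; the function `specific_name` and its parameters are hypothetical.

```python
from datetime import date

# Month abbreviations are spelled out so the result does not depend on locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def specific_name(project, description, when, extension):
    """Build a name like project-XYZ-prototype-report-13-Aug-2019.ppt."""
    stamp = f"{when.day:02d}-{MONTHS[when.month - 1]}-{when.year}"
    slug = "-".join(description.split())  # spaces become hyphens
    return f"project-{project}-{slug}-{stamp}.{extension}"
```

For instance, `specific_name("XYZ", "prototype report", date(2019, 8, 13), "ppt")` reproduces the filename given in the list.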
posted by equalpants at 5:29 PM on November 23, 2021 [8 favorites]


Anything of mine important and not too private is in GitHub
posted by sjswitzer at 6:05 PM on November 23, 2021 [1 favorite]


This is a masterful trolling by Johnny, whoever the hell he is. I was deeply skeptical at the arbitrary division of all the things into exactly 100 buckets. I started to suspect something was up when he got to the part about using 3-6 digit identifiers as mnemonics to help you, presumably a human being who is no better at remembering arbitrary 6-digit numbers than you are at remembering where the hell you put that email from Aunt Marie with the recipe in it. I knew the game was up when I got to the part where you put your ridiculous 6 digit identifiers in EMAIL SUBJECTS TO OTHER PEOPLE, who one assumes would see your messages in their inboxes and mentally assign you to whatever arcane category identifier they use to denote people whose emails they refuse to read.
posted by Mayor West at 6:08 PM on November 23, 2021 [16 favorites]


Wait i just reread my comment and realized I used the word "mnemonic" to describe his ridiculous category titles, and then I remembered this dude's first name, and then, too late, I understood.
posted by Mayor West at 6:12 PM on November 23, 2021 [9 favorites]


Even in the days of paper records, people used “aliases”—you’d find a file folder for the Smith account, and it would contain a single piece of paper reading “See Jones account.” There was a huge amount of work in checking files in and out, archiving them, etc.

I think if you have no organizational system, anything is an improvement, and this has the virtue of being easy to understand and not intimidating in scope. The trick is in the follow-through.
posted by adamrice at 6:18 PM on November 23, 2021 [6 favorites]


I like the efforts made to have it be not too wide and not too deep. The bits where he broke his own rules in either case were amusing and made enough sense.
posted by clew at 6:36 PM on November 23, 2021 [1 favorite]


In library school, I took an interest in computer-assisted classification. I ultimately came up with a system based on binimals. A document comes into the system and based on textual analysis is classed between two other documents. The number assigned to the new document is the binimal that lies between the two documents that lie on either side of it. For example if Doc1 is binimal .1 and Doc2 is binimal .01, Doc3 placed between them is binimal .011. This is for the purely linear sequence. For relational networks, the idea was to use something like Infranodus.
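
The linear part of this scheme can be illustrated with a short sketch. It assumes the new document's number is the exact midpoint of its two neighbours, which matches the .1 / .01 / .011 example but may not be the rule the original system used; the function names are hypothetical.

```python
from fractions import Fraction


def binimal_to_fraction(binimal):
    """Read a string like '.011' as a binary fraction (here 3/8)."""
    digits = binimal.lstrip(".")
    return sum(Fraction(int(d), 2 ** (i + 1)) for i, d in enumerate(digits))


def fraction_to_binimal(value):
    """Write a dyadic fraction in [0, 1) back as a binimal string."""
    digits = []
    while value:
        value *= 2
        if value >= 1:
            digits.append("1")
            value -= 1
        else:
            digits.append("0")
    return "." + "".join(digits)


def between(left, right):
    """Return the binimal halfway between two neighbouring documents."""
    midpoint = (binimal_to_fraction(left) + binimal_to_fraction(right)) / 2
    return fraction_to_binimal(midpoint)
```

With these definitions, `between(".1", ".01")` returns `".011"`. Exact fractions are used instead of floats so that repeated insertions never lose digits, at the cost of numbers that grow one digit longer with each insertion into the same gap.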
posted by No Robots at 7:31 PM on November 23, 2021


I'm fascinated by how negative the reactions are! The method didn't strike me as at all radical when I read it; it seems like a pretty basic variant on organizing files into folders by topic, no? As an ADHD person this type of maybe arbitrary but ultimately highly delineated method appeals to me a whole lot because it strips out the steps of looking for something that are likely to side-track me or trigger an emotional response.
posted by dusty potato at 7:45 PM on November 23, 2021 [11 favorites]


Feels a little similar to the backlash of folks calling Marie Kondo a "monster" etc for coming up with a strict and slightly eccentric organizational method that plays to some people's needs...
posted by dusty potato at 7:48 PM on November 23, 2021 [3 favorites]


Zettelkasten, was sure this was going to be just a good joke. ya ya, made me look (made me goog) ... It's a thing. But not sure what category of thing.
posted by sammyo at 7:48 PM on November 23, 2021 [1 favorite]


maybe, maybe this is written in earnest but as i said above i do not think it is. maybe some people's brains work naturally like this, and more power to them, but if the general populace could do this we wouldn't need DNS would we, fellow 54.244.168.112ians?
posted by glonous keming at 7:55 PM on November 23, 2021 [2 favorites]


This is a basic records management strategy, useful for administrative records if you add some things like naming conventions, and some protocols around email retention or archive requirements.

And as we all know, government administrative offices are definitely the place we put things we want to find and retrieve later.

I am old enough to have used many physical file systems, including decimal ones, and I've come to conclude that computer file systems fundamentally break the metaphor. A physical file system is a single-level hierarchy; there are boxes or shelves, and then there are files or folders, but they don't nest, and they certainly don't nest infinitely.

The filing system itself isn't even that good. Every year I get a ton of 1099 forms and expenses. Do I put them in the folders associated with statements for that income / expense, or in a separate 'taxes 2021' folder, since I need them primarily to file taxes? Ideally it goes in both. This whole 'there's only one place for a thing' is fundamentally broken. What you need is close to a 'filesystem as tags' approach taken by gmail and later tools. And then let us pray we never accumulate so many tags that we need a better file system for tags.
posted by pwnguin at 8:20 PM on November 23, 2021 [3 favorites]


if the general populace could do this we wouldn't need DNS would we, fellow 54.244.168.112ians?

Well, no. We would still need DNS as a key part of Uniform Resource Identifiers; otherwise we'd have to redo all our bookmarks the day mefi migrated to AWS.
posted by pwnguin at 8:22 PM on November 23, 2021


The problem with being strict about "no more than 2 layers deep" is if you have a lot of things in a subfolder you'll easily waste more time going through them to find what you want than if you had made an extra layer of subfolders.
posted by juv3nal at 8:23 PM on November 23, 2021 [2 favorites]


Yeah, I see this with a lot of software nerds. They're convinced that the tedious hierarchy/artifice of pure hubris they've created AND TOTALLY MAKES SENSE TO THEM is the one true categorization and now inflict it on all their coworkers.

They haven't organized anything. They've only invented a new bureaucracy.
posted by AlSweigart at 8:42 PM on November 23, 2021 [2 favorites]




Note that the place most white-collar workers store their most important work is also the technology with the most advanced internal search function: not a file system, no, I'm talking about email.

I have come around to an "everything is email" storage and filing system for my personal archive.

At least for me, it has some significant advantages over just keeping everything as individual files on the computer's native filesystem. Most filesystems don't offer the metadata that email has, and if they do it's typically not portable from one system to another. (E.g. the Mac HFS filesystem has the concept of 'resource forks', which can store almost arbitrarily complex metadata, but things get ugly real fast if you copy a file with one to a non-HFS volume.) Despite everyone basically understanding that metadata is an important thing, no decent cross-platform standard for file metadata has emerged at the filesystem level. Shame, really.

However: the MIME RFCs provide a cross-platform, future-proof, human- and machine-readable way of encapsulating and adding metadata to arbitrary documents. Once you have something encapsulated in MIME (as an ".eml" file, typically), you can store it as-is, you can send it to someone, you can put it into a local email store, you can push it up to cloud storage via IMAP, you can search it by metadata or content, you can encrypt or sign it, whatever. And it really wouldn't be hard for operating systems to provide visibility into MIME metadata from the Finder / Explorer if they wanted to, much like how Windows Explorer used to (still does?) provide columns for ID3 tag data when browsing music files.
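
As a rough illustration of the encapsulation described above, here is a sketch using Python's standard `email` package. The function name `wrap_as_eml`, the choice of headers, and the `X-Source-System` field are assumptions made for the example, not the converters mentioned in this comment.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime


def wrap_as_eml(body_text, subject, author, keywords, source_system):
    """Wrap a piece of text in a MIME message and return it as bytes."""
    message = EmailMessage()
    message["Subject"] = subject
    message["From"] = author
    message["Date"] = format_datetime(datetime.now(timezone.utc))
    # Keywords is a standard header field; X-Source-System is a made-up
    # extension header used here only to show custom metadata.
    message["Keywords"] = ", ".join(keywords)
    message["X-Source-System"] = source_system
    message.set_content(body_text)
    return message.as_bytes()
```

The returned bytes can be written to a file ending in `.eml` and opened by most mail clients, or parsed again with `email.message_from_bytes` to read the metadata back.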

Over the years, I've made a variety of converters to wrap various things into pseudo-messages that I can store in a mailbox and browse/search with an email client. (E.g. this is one for Metafilter comments.) Currently I'm working through my dead-computer backups, converting various flavors of instant message logs into email messages. This was inspired by the unpleasant realization a few months back, that several years worth of old iChat logs had become entirely unreadable, stuck in a binary format that Apple has since abandoned. (Nice going, guys.)

It works well for structured, repetitive documents (blog posts, recipes, forum posts, business correspondence, legal filings, chat logs, diary entries, etc.); the advantage is smaller for totally unstructured text.

Even in the days of paper records, people used “aliases”—you’d find a file folder for the Smith account, and it would contain a single piece of paper reading “See Jones account.” There was a huge amount of work in checking files in and out, archiving them, etc.

At one point in my career, I did business process automation. Specifically, I designed systems to replace paper document workflows in document-centric industries like insurance claims processing or retail banking. One of the fun parts of the job was going into an office and just following a paper file through the entire process, watching who did what with it, how they decided who to send it to next, etc. Paper workflows can be incredibly complex, and there is (well, was) a huge amount of accumulated knowledge and praxis about how to handle paper documents. You can spend an awful lot of money on software, just to replicate the experience of a file folder and inter-office mail envelope. More than once I had to look someone dead in the eyes and tell them their multi-million dollar investment in state-of-the-art workflow management software just couldn't replicate the same process they'd been doing for years with color-coded Post-It notes.
posted by Kadin2048 at 8:51 PM on November 23, 2021 [5 favorites]


Search > Index

OK Zoomer
posted by MrGuilt at 8:55 PM on November 23, 2021 [5 favorites]


Sooner or later, as taxonomy is discussed, someone must bring up the classification used by the Celestial Emporium of Benevolent Knowledge, so I guess it might as well be me.

Those that belong to the emperor
Embalmed ones
Those that are trained
Sucking pigs
Mermaids (or Sirens)
Fabulous ones
Stray dogs
Those that are included in this classification
Those that tremble as if they were mad
Innumerable ones
Those drawn with a very fine camel hair brush
Et cetera
Those that have just broken the flower vase
Those that, at a distance, resemble flies

A couple of decades in IT have convinced me that actually, this better resembles the real world than the kind of taxonomy one is taught to produce as a professional.

The small sum I spent on the collected short stories of Borges was very well-spent, by the way.
posted by i_am_joe's_spleen at 8:59 PM on November 23, 2021 [22 favorites]


yanno.

What we need is a file system based off tags. And you could start, sure, with tags from existing directory structures. So a file in Accounting -> Acct5201 -> Debts would be tagged with all three of those. But why not also let it be tagged "Bob's Project", or "Closed Account" or whatever?

Search is better than indexing by far, but **ESPECIALLY** when you have some easily searched tags to latch onto.

Instead we get a file storage metaphor based on actual physical paper files, folders, and drawers and it largely sucks. It's hard to find stuff, stuff gets lost, bleh.

But I can't seem to find a tag based document storage system anywhere. It seems so obvious, and I guess I'll have to code my own if I want it to happen.
posted by sotonohito at 9:04 PM on November 23, 2021 [2 favorites]


OK Zoomer
Gradually, Garland came to the same realization that many of her fellow educators have reached in the past four years: the concept of file folders and directories, essential to previous generations’ understanding of computers, is gibberish to many modern students.
Which is exactly why those of us whose initial reaction to the iPhone was visceral horror have been jumping up and down and warning all and sundry about this since it first came out.

It used to be that "Where did you save it?" "In Word" was a conversation that required thirty minutes of gentle education in order to save the confused from a lifetime of staying confused. Then iOS came out and made everything literally work that way and, well, here we are.

It has to be said, though, that the rot had begun to set in before iOS. Windows needs to take a fair bit of the blame as well, with its disconnection of Desktop and My Computer and My Documents and My Pictures and so forth from the filesystem hierarchy. But it's still mostly Jobs's fault. Fucking marketroids ruin everything.
posted by flabdablet at 9:05 PM on November 23, 2021 [10 favorites]


"Now we have ten areas which contain ten categories each. That’s a hundred categories at the very most. It’s very unlikely you will end up with a hundred categories."

Oh you sweet, summer child.
posted by AlSweigart at 9:13 PM on November 23, 2021 [16 favorites]


flabdablet The problem isn't with iOS, or at least not specifically, and really isn't a bad thing.

Yes, clearly it's good for people to know that their documents actually reside somewhere in actual physical media rather than residing in the app they use.

But is that really necessary for everyday use of documents? Unless you're a major geek like me you won't be doing anything with Word files except opening them in Word. So why not integrate that better into Word?

I will also note that all of our file organization systems are metaphors. There's no such thing as directory structure in an actual hard disk, it's all physical location on the platter of individual segments. We impose the directory metaphor because it was, at the time it was developed, a useful metaphor.

But it's just a metaphor. We can just as easily use other metaphors. And the directory model is limiting: you have to puzzle out where person X might put something in the file structure, because often things like that are ambiguous. Sure, with symlinks you can have files sort of available in all the places you might think they belong, but it's still clumsy.

Since we're not trying to sell computing to people from the 1940's who are really devoted to physical analogues why not move away from that and to a different model?

I'll agree that "file X can only be accessed through app Y and no we will **NOT** provide you with an actual file browser" is a bad model. But just because that one is bad doesn't mean the current one is really all that great.
posted by sotonohito at 9:14 PM on November 23, 2021 [10 favorites]


But I can't seem to find a tag based document storage system anywhere.

I've made one, the problem is that either you have a space amenable to a small number of useful (but sometimes overlapping) tags that are pretty static, or you have a system with tag explosion, where one person's $foo is another person's $bar and now you worry that if you find a relevant tag you may be missing some stuff that's actually under a (slightly) different tag... and then you need a system that lets you search tags.
posted by axiom at 9:21 PM on November 23, 2021 [3 favorites]


I started typing a variety of "search versus directory" or "iOS isn't the issue" responses. One draft was "the death of NetWare is the issue," and realized that was almost the mark.

For files on my personal laptop, that only I use and access, it's not a huge deal.

It breaks down the moment I connect to a network and fire up Mail. All of a sudden, I have to figure out where to even begin to look. It could be on my local drive, in my email, in some personal cloud storage, or some shared storage (SharePoint, et al.). Versioning becomes an issue--I mean, SharePoint, God love it, tries its darndest*, but I'm still not sure if the version I'm checking out is the right one, or if there is a more current/active version someone downloaded and is passing back and forth in email.

Hell really is other people.

*And I'm not a SharePoint fan
posted by MrGuilt at 9:23 PM on November 23, 2021 [3 favorites]


... replacing a series of self evident words with a numerical index you have to memorize?!

I say this as an erstwhile archivist, librarian, hoarder of digital stuff + someone who contends on a near-daily basis w/vast, poorly-organized heaps of organizational files on both random drives and in content management systems:

If you do nothing else, for the love of god, give your files human-readable names that provide some hint of their content and purpose.

Everything else is brittle and doomed, and will, in time, be lost or hopelessly fouled up, by you or by the next guy.
posted by ryanshepard at 9:26 PM on November 23, 2021 [16 favorites]


So a file in Accounting -> Acct5201 -> Debts would be tagged with all three of those. But why not also let it be tagged "Bob's Project", or "Closed Account" or whatever?

You can do this with symlinks. Each tag you want just becomes a directory, and you tag a file by adding a symlink to it inside that directory.

As a bonus, this scheme also allows you to organize your tags hierarchically if that's your jam. It also doesn't rely in any way on the files' interior formats.

Symlinks are safer than hardlinks for this use case, because a directory full of symlinks so clearly is a tagging artefact and there is still an authoritative spot for all the tagged stuff to live. Going hog-wild with hardlinks would, I think, run a much stronger risk of cross-project data corruption.
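
A minimal sketch of the symlink scheme described above, assuming a POSIX-style filesystem where unprivileged symlinks work. The `tags_root` directory layout and the function names are hypothetical choices for the example.

```python
from pathlib import Path


def tag_file(target, tags_root, tag):
    """Tag a file by placing a symlink to it inside the tag's directory."""
    target = Path(target).resolve()
    # A tag such as "projects/bob" becomes nested directories, which is
    # how hierarchical tags fall out of the scheme for free.
    tag_dir = Path(tags_root) / tag
    tag_dir.mkdir(parents=True, exist_ok=True)
    link = tag_dir / target.name
    if not link.is_symlink():
        link.symlink_to(target)
    return link


def files_with_tag(tags_root, tag):
    """List the real files reachable through one tag directory."""
    tag_dir = Path(tags_root) / tag
    if not tag_dir.is_dir():
        return []
    return sorted(p.resolve() for p in tag_dir.iterdir() if p.is_symlink())
```

The file itself stays in its one authoritative location; removing a tag is just deleting a link. One limitation of this sketch is that two files with the same name cannot share a tag directory, so a real version would need some way to disambiguate link names.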

just because that one is bad doesn't mean the current one is really all that great

It's not great, I agree. Merely much better than what appears to be in the process of replacing it.

Unless you're a major geek like me you won't be doing anything with Word files except opening them in Word. So why not integrate that better into Word?

For the exact reason that Word then becomes conceptually identical to The Computer, and people simply never learn that there are faster and easier ways to do file maintenance operations like backing stuff up than by going through a painful, one at a time, Open/Save As dance.

Convenience is good and nice, but the modern world has fetishized it to a completely self-defeating extent.

Making stuff that works around the old requirement that using a computer required a little bit of education in fundamental concepts has not, in fact, made computing more democratic. Knowledge is power, and designing systems that deliberately make it harder to know where any particular piece of your data actually resides is absolutely a power grab from the tech priesthood.
posted by flabdablet at 9:26 PM on November 23, 2021 [6 favorites]


What we need is a file system based off tags.

Ultimately that's just longer file names.

Searching inside documents is like accessing ALL THE TAGS!
posted by ChurchHatesTucker at 9:28 PM on November 23, 2021 [3 favorites]


I'm no fan of what Apple's i-devices have done to computing, but there's nothing particularly special or natural about the idea of hierarchical "folders" into which one places "files". That's just one particular metaphor or abstraction for representing data to the user in a comprehensible way.

It's one that's popular to the point of near ubiquity on PCs, but there were and still are other approaches, which use different metaphors and expose data very differently.

IBM had a whole line of machines, beginning with the System/38 and continuing on through the AS/400 (whatever they're calling it this week), which originally lacked a "filesystem" in the DOS or Unix meaning of the term. Instead, the operating system exposed a database to all applications, to which they could write or read data. Instead of thinking about data as "files", you think about it as rows/columns or as "records". It seems unnatural, because we're so used to the file-based metaphor, but it is a pretty good fit for many kinds of business data, and probably seemed quite natural to people used to stacks of punched cards. (It's worth noting that Unix systems typically used linear paper tape, not punched cards, which were more of a mainframe thing.) Database-centric schemes also make it trivial to combine data from multiple applications: you can basically do SQL-like JOINs, create views, etc. from one app's data to another's.
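
To make the "records instead of files" idea concrete, here is a small sketch using SQLite from Python's standard library. SQLite is only a stand-in for illustration and is not how the System/38 or AS/400 worked internally; the table names and sample rows are invented.

```python
import sqlite3


def total_claims_by_customer():
    """Join records written by two imaginary applications in one database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Written by a hypothetical customer-management application.
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
        INSERT INTO customers VALUES (1, 'Smith'), (2, 'Jones');

        -- Written by a hypothetical claims-processing application.
        CREATE TABLE claims (claim_id INTEGER PRIMARY KEY,
                             customer_id INTEGER, amount REAL);
        INSERT INTO claims VALUES (100, 1, 250.0), (101, 2, 75.5),
                                  (102, 1, 40.0);
    """)
    rows = conn.execute("""
        SELECT customers.name, SUM(claims.amount)
        FROM claims JOIN customers USING (customer_id)
        GROUP BY customers.name
        ORDER BY customers.name
    """).fetchall()
    conn.close()
    return rows
```

Because both applications' data live as rows in the same store, combining them is a single query rather than an export and import between file formats.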

Then there are Object Based Storage systems, where you store and retrieve abstract data objects, based on their attributes or properties, from a giant single-level pool of data. These systems scale well and underlie much of the modern Internet. Some of them will let you do truly bizarre (to a person used to managing files in a directory structure) stuff, like making the "value" of a property on one object be a pointer to a second object, and then have the second object reference the first object again, etc. Again, it seems unnatural and overly complicated to me, but to people who were really high on the idea of object oriented programming in the 1990s, it made sense.

Where things get ugly is when users or application designers try to shoehorn one metaphor into a system designed for a different one, or layer one on top of another. It happens all the time (and in fact happens every time someone on a Unix-based system spins up MySQL or DB2), but you almost always have to take care where the two abstractions meet.
posted by Kadin2048 at 9:53 PM on November 23, 2021 [3 favorites]


That's just one particular metaphor or abstraction for representing data to the user in a comprehensible way

Sure, and like all metaphors and abstractions it has its strengths and weaknesses and use cases it fits really well and others it fits really poorly and everything in between. Overall, though, the fact that so many other systems have kind of gravitated over time toward supporting some kind of hierarchical name space is a pretty solid clue that they're actually not too bad as these things go.

But I don't think the metaphors actually matter all that much. What does matter in an age of ubiquitous personal digital devices, to my way of thinking, is that people understand that

(a) Every bit of data you care about resides on some storage medium somewhere, and that if you don't actually have any idea where at least two identical copies of your data are stored then you're already half way to losing it forever.

(b) Every bit of data you care about is conceptually separable from the tools you use to acquire and/or create and/or manipulate it. Your Word documents don't exist in Word but on some storage device somewhere; Word is just the particular tool you use to create and manipulate them. Further, you should in general expect to be able to manipulate the same data with some other tool if you find one that suits you better.

(c) There are things that not only can be done but need to be done with digital data that depend not at all on what it represents; at a bare minimum, it's valuable to know how close to full any given storage medium is and how to make an identical copy of data from one device on another, and it's well worth becoming familiar with tools such as file browsers that facilitate these things.

(d) Using domain-specific data creation and manipulation tools such as Word or iTunes for jobs that a general purpose file browser could do many times more easily and reliably and less idiosyncratically, for no better reason than unfamiliarity with your local file browser and its underlying data storage metaphors, will waste incalculable amounts of your valuable time as well as making data loss far more likely.

(e) Systems that provide nothing like a general purpose file browser or (worse) actively impede their implementation are broken by design and best avoided.
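To make (c) concrete: the two content-agnostic jobs mentioned there, checking how full a volume is and making a verified identical copy, take only a few lines in most languages. A minimal Python sketch using only the standard library; the function names are mine, and a file browser or backup tool does the same thing with less typing.

```python
import filecmp
import shutil
from pathlib import Path


def percent_full(path: str) -> float:
    """Return how full the volume holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def copy_and_verify(src: str, dst: str) -> bool:
    """Copy a file (with timestamps) and confirm the copy is byte-identical."""
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    # shallow=False forces a byte-by-byte comparison rather than trusting
    # size and modification time alone.
    return filecmp.cmp(src, dst, shallow=False)
```

Neither function knows or cares whether the file is a Word document, a photo or a tax form, which is the whole point.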
posted by flabdablet at 10:49 PM on November 23, 2021 [7 favorites]


Also, the idea of a "general purpose file browser" is completely conceptually separable from that of a user interface. It may well be that rather than such a browser being some specific GUI application, it's a subset of the functionality exposed by something like a textual command shell. The point is that every well designed system has something you can use for manipulating data in a content-agnostic fashion.
posted by flabdablet at 10:57 PM on November 23, 2021 [3 favorites]


Then iOS came out and made everything literally work that way and, well, here we are.

It has to be said, though, that the rot had begun to set in before iOS. Windows needs to take a fair bit of the blame as well, with its disconnection of Desktop and My Computer and My Documents and My Pictures and so forth from the filesystem hierarchy. But it's still mostly Jobs's fault.


Funny thing is somebody at Apple (not necessarily Steve Jobs) clearly loved the desktop metaphor as they held onto the original Mac vision of it through the whole era that Microsoft was fucking around with stuff like that and Frankenstein-ing bits of web browser into the GUI.

Other funny thing is that iOS finally (re)introduced files several major versions ago.
posted by atoxyl at 1:03 AM on November 24, 2021


Having a content management culture is fundamental to running a company that’s not a constant struggle of disorganization. Relying on search and tags doesn’t work well enough, nor does being able to search through documents. There are just too many versions of files and data types (images, charts, graphics, ...) to make it feasible or realistic. Search is good as a shortcut to a file you probably already know the location of anyway, once your system is in place.

Something like a Personal Dewey Decimal System is best* whether you use the numbering scheme or not. This is where the 80/20 rule comes into play. Your imperfectly well-sorted data is 80%, and the last 20%, where you have less or no influence, you can document or copy/move into your system.

You can draw boundaries around messy unsorted data and files. For example: A client or contractor sends a version of a file via email and the next day on a floppy disk? It gets immediately copied into company/clients/client-x/year-month-thing-version.extension. If you’re using a collaborative web application, a link to the project on the web app gets saved in company/projects/project/info.text. And so on.

flabdablet explained it eloquently. Files are files. For example: If someone sends me an email with a file in it, I think of it like a physical package that someone sent to my office. The package was delivered, but I can’t just look at it once and leave it there just because my name and the date are on the label. My employees will not be pleased with a giant stack of cardboard boxes by the front door that they can’t touch but need access to. No, the thing needs to be unpacked and put in its proper place.

It’s ok to duplicate files once in a while.

Over time and with occasional improvements, it helps you figure out what processes you’re actually doing at a company. I can see how many revisions it takes to get a result by looking at a project folder. I can see how long the project took by seeing the first and last file, which are all date stamped in the name. It reduces the amount of data wrangling needed to analyze processes and improve them.
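As an illustration of how little wrangling that takes, here is a small Python sketch that reads the revision count and elapsed months straight off the file names. It assumes names shaped like year-month-thing-vN.extension (for example 2021-03-logo-v2.pdf); that exact pattern is my guess at a workable convention, not a rule.

```python
import re
from datetime import date

# Assumed naming pattern: YYYY-MM-thing-vN.ext, e.g. "2021-03-logo-v2.pdf"
NAME_RE = re.compile(
    r"^(\d{4})-(0[1-9]|1[0-2])-(?P<thing>.+)-v(?P<version>\d+)\.\w+$"
)


def summarize_project(filenames: list[str]) -> dict:
    """Report first/last month and revision count from file names alone."""
    stamps = []
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the convention
        year, month = int(match.group(1)), int(match.group(2))
        stamps.append(date(year, month, 1))
    if not stamps:
        return {"revisions": 0, "first": None, "last": None, "months": 0}
    first, last = min(stamps), max(stamps)
    months = (last.year - first.year) * 12 + (last.month - first.month)
    return {"revisions": len(stamps), "first": first, "last": last,
            "months": months}
```

Files that don't match the pattern are simply ignored, so a stray notes.txt in the project folder doesn't skew the numbers.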

I try to look away when someone at another company opens up a bunch of files searching for their whatever thing they want to show me. It’s painful. I think about the rocks, rapids and unexpected branches downstream that the employees/co-workers/secretaries have to navigate.

* absolutely the best way to do things in all circumstances — now get off my lawn.
posted by romanb at 1:37 AM on November 24, 2021 [1 favorite]


The filing system itself isn't even that good. Every year I get a ton of 1099 forms and expenses. Do I put them in the folders associated with statements for that income / expense, or in a separate 'taxes 2021' folder, since I need them primarily to file taxes? Ideally it goes in both.

In the country where I live (not the US), taxes and expenses are from the same year (2021 taxes are based on 2021 income and 2021 expenses and so on), so I throw everything, including forms, into my 2021 Tax folder, even if the filing is done in 2022. If for some reason there is further correspondence from the tax authority, say in 2023, I don’t want it spread over many years when it all relates to 2021. I recognize there are far more complex scenarios than mine, but I find this simpler than tagging things up, and it allows me to find things on various devices that have access to my cloud storage (not every search algorithm provides the same result, and this way I don’t depend on a particular app or operating system to know where that letter from the tax office is):

/finance/2021/tax:
----/expenses/month/expense-x.pdf
----/income/month/invoice-x.pdf
----/forms/...
----/correspondence/2022-statement-from-tax-office.pdf
posted by romanb at 2:12 AM on November 24, 2021


It’s ok to duplicate files once in a while

Necessary, even, if each of the duplicates is going to be subject to independent ongoing modification. But if there is supposed to be an authoritative version at all times, it pays to make sure you know which one that is. The principle of a Single Source of Truth is a valuable one and should not be lightly dispensed with.

Programmers have to deal with this kind of thing daily, which is why version control systems are a thing. But using a version control system requires a concept of data having an existence independent of the tools that created it so solid as to be completely instinctive, which is probably why formal version control systems remain relatively rare outside the programming community.
posted by flabdablet at 3:18 AM on November 24, 2021 [1 favorite]


But again, none of this organizational stuff matters anywhere near as much as always having adequate backups.

The principle I strongly recommend to all and sundry is that the working copy of anything goes in local storage, on media over which the author has full control, before being copied anywhere else. Which it should then be, preferably quickly or at the very least regularly.

So sure, have your company Sharepoint or your school Google Apps cloud or your online iPhoto library or whatever your org or your vendor insists you should have. Even make those the authoritative copies, if that's what organizational policy or personal preference dictates. But for the love of all that's digital do not make a habit of relying on data to be there when you need it unless you can put your actual physical hands on media containing a copy of it that you decided how to organize. That way lies only suffering. The way to get stuff onto shared storage ought to be via file browser or backup script, never by Save As.

Also, USB flash memory sticks are the floppy disks of 2021. If you absolutely must use these toys for anything less ephemeral than a few hundred photos on their way to the print shop, keep at least three identical copies.
posted by flabdablet at 3:34 AM on November 24, 2021 [3 favorites]


The pages of the website for this system don’t appear to be organized by this system. Does that tell you anything?
posted by 3.2.3 at 5:34 AM on November 24, 2021 [2 favorites]


posted by 3.2.3 at 9:34 AM on November 24 [+] [!]

Eponysterical?
posted by snofoam at 5:48 AM on November 24, 2021 [4 favorites]




Categorization? Really, people! Why go through all that unnecessary complication when you could just assign every file a Gödel number and remember that?
posted by CheeseDigestsAll at 6:31 AM on November 24, 2021 [5 favorites]


You laugh, but that's essentially IPFS.
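A stripped-down sketch of the idea, in Python: the "name" of a piece of data is a hash of its contents, so you never choose a name or a folder at all. Real IPFS uses content identifiers (CIDs) built on multihash and splits large files into linked blocks; the plain SHA-256 hex digest and the in-memory dictionary here are simplifications of mine.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the bytes."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store data and return its address (a SHA-256 hex digest)."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data  # identical content maps to the same key
        return address

    def get(self, address: str) -> bytes:
        """Fetch data by address; raises KeyError if nothing is stored there."""
        return self._blobs[address]
```

Storing the same bytes twice gives the same address back, so duplicates collapse for free. Remembering the address is the hard part.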
posted by flabdablet at 7:24 AM on November 24, 2021 [1 favorite]


I keep a lot of notes and docs in a little system of just year-based text files. The techniques I use allow me to quickly add an item without distraction, and I successfully retrieve items from years ago. At this point I have good records going back to 2013.

It is critical to reduce the cost of taking a note or filing an item. If it is expensive to create and file an item, then it won't get done, and if it didn't get recorded then no categorization system will allow you to retrieve it. You cannot know ahead of time what you will want to retrieve in the future. So I shift the costs toward retrieval. On retrieval, once a piece of information is needed, the value can be taken into account and a decision can be made to dedicate 5 seconds or 5 minutes to retrieval. I find I have no problem with retrieval.

The editor I use is set up to "capture" a new entry. When I do a capture it drops me into a new editor window, with a date heading, and I type my notes. When I hit save, it returns me to the editing window I was in before. This is so quick that there is a qualitative difference compared with slower techniques. I'll record a one sentence note. That one sentence note will be revisited with another idea. Then I'll come back and brainstorm some options, why not type them out? Then I experiment with these and take note of why each option is ruled out. Then 4 years later, I have a problem in the same area and I can revisit this. This wouldn't happen if there was any friction between normal work and taking notes. Well, maybe this isn't true, because now that I have seen what having notes is like, I value it much more.
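For anyone who wants to try the same workflow without a particular editor, the core of it fits in a short Python sketch: one text file per year, each entry appended under a date heading, and retrieval by plain text search. The file naming, the heading format and the function names are assumptions for illustration, not a description of the editor setup above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def capture(note: str, notes_dir: Path, now: Optional[datetime] = None) -> Path:
    """Append a date-headed entry to the current year's notes file."""
    now = now or datetime.now()
    notes_dir.mkdir(parents=True, exist_ok=True)
    target = notes_dir / f"{now.year}.txt"
    with target.open("a", encoding="utf-8") as handle:
        handle.write(f"\n== {now:%Y-%m-%d %H:%M} ==\n{note.strip()}\n")
    return target


def search(term: str, notes_dir: Path) -> list[str]:
    """Retrieval is a plain case-insensitive scan over every year file."""
    hits = []
    for path in sorted(notes_dir.glob("*.txt")):
        for line in path.read_text(encoding="utf-8").splitlines():
            if term.lower() in line.lower():
                hits.append(f"{path.name}: {line}")
    return hits
```

Capturing costs one function call and no filing decision; all the effort is deferred to `search`, which is only paid for notes that turn out to matter.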
posted by bdc34 at 8:08 AM on November 24, 2021 [2 favorites]


It is critical to reduce the cost of taking a note or filing an item.

This. Which is exactly why the proposed system, with its finicky numbering scheme and its need to maintain a central index in parallel with all the various data hoards it's trying to unify, will end up on the rot pile with all the other Grand Unified Theories of data organization.

It's just a fact that old stuff that doesn't get looked at very often is always going to be more trouble to dig out. Deal with it.

I shift the costs toward retrieval

...is the correct approach.
posted by flabdablet at 8:15 AM on November 24, 2021 [1 favorite]


My brain could not process anything beyond the fourth section.
I have cultivated about me a sort of motile swamp of experiences, and at hand is food and knowledge, non-specific, and I reach out and take the thing in front of me.
Is it the best thing?
Never. It is the near-at-hand thing.
I can search my swamp, if I'd like. I have, at present, nearly 70k unread emails. My gdoc is like a galloping nightmare of documents, I have a few piles of old hard-drives pried from dead laptops.
Dumping into endless databases and usb drives.
When I run out of digital space I faithfully give $20 to the Google and I move forward.

I will do this until I die and then it will no longer be my problem.
posted by Baby_Balrog at 8:16 AM on November 24, 2021 [4 favorites]


I think the point of this article is : "use numbers along with your folder names, it helps create a stable 'geography' of folders. that will help you remember where things are more than you think. using names only will mess up the usefulness of this order."

and another bit is "try not to make the subfolders go too deep."

which is helpful to remember. I have numbered my folders on occasion. I have never moved to "numbers-only," and god help anyone working with someone who does.

But, as people have pointed out, just because you develop a 'geographical' memory of where folders are, doesn't mean your co-worker will. you still need names in a regular language that can be read, and nowadays, be searched.

And there's basically a bunch of other whoo-doo in there that is an insight into this guy's personal weirdness about how he files his emails and invoices.
posted by eustatic at 9:12 AM on November 24, 2021 [2 favorites]


36 92 70 74 49 66 99. 49 15 99 74 55 29 64 89 30 64 42 91. 55 29 89 30 91 74 67 13.

36=I
92=Think
70=This
74=Is
49=A
66=Great
99=Idea
15=Better
55=To
29=Replace
64=All
89=Words
30=With
64=Unique
42=ID
91=Numbers
67=More
13=Efficient
posted by AlSweigart at 9:25 AM on November 24, 2021 [4 favorites]


I think everyone should develop their own personal whoo-doo of a system. Dividing everything into ten meta-directories, each with ten subs? Those 10s are incredibly arbitrary and rigid amounts, IMO -- use amounts which make sense to you.

Adding random numbers to the directory and file names just seems pointless and obfuscating to me, like something a clueless and insecure supervisor would mandate for job security.
posted by Rash at 9:28 AM on November 24, 2021 [2 favorites]


Those 10s are incredibly arbitrary and rigid amounts, IMO -- use amounts which make sense to you.

Here arises the division of minds into prime number organizers and composite number organizers, although the primers resent being subdivided. And the 2^n-ers keep trying to put a hex on the rest of us.
posted by clew at 10:11 AM on November 24, 2021 [1 favorite]


Nope-12.08
posted by hypnogogue at 11:17 AM on November 24, 2021


My system is to only have 10 documents at any given time.
posted by mazola at 12:48 PM on November 24, 2021 [2 favorites]


mazola - Append? Replace? Fail?
posted by clew at 12:55 PM on November 24, 2021


You could just buy the Cinco Midi Organiser and give all your files a UMRN (Unique MIDI Routing Number)
posted by credulous at 1:26 PM on November 24, 2021 [1 favorite]


Metafilter: a sort of motile swamp of experiences.
posted by ryanshepard at 1:38 PM on November 24, 2021 [1 favorite]


> The problem with being strict about "no more than 2 layers deep" is if you have a lot of things in a subfolder you'll easily waste more time going through them to find what you want than if you had made an extra layers of subfolders.

> I will also note that all of our file organization systems are metaphors. There's no such thing as directory structure in an actual hard disk, it's all physical location on the platter of individual segments. We impose the directory metaphor because it was, at the time it was developed, a useful metaphor.

The first Macintosh OS filesystem (circa 1984) had a pretty close to literal directory structure on the drive. This worked well enough during the brief time when 400 KB floppies were the only media available, since folders could not be nested (no subfolders!) and the 1400 file limit was only going to be exceeded by somebody being deliberately perverse. Apple ditched MFS in favor of HFS in System 2.1, less than two years later, and chaos reigns to this day.
posted by ardgedee at 3:35 PM on November 24, 2021


你笑了, AlSweigart。 但这就是书面中文的运作方式。

你=You
笑了=are laughing
但=but
这=this
就=just
是=is
书面=written
中文=Chinese
的='s
运作=operation
方式=pattern

(apologies to native speakers)
posted by patrick54 at 8:55 PM on November 24, 2021 [1 favorite]


Being proud of data dump culture is something I’ll never understand, whether it’s Amazon Inc being sloppy with millions of users’ data or someone’s personal trillion-photo collection spread over disks and servers in landfills all over the place.

Not being able to delete a folder and a backup of that folder because who knows what’s in that dump is not somewhere I’d want to be. Hoarding is hoarding, whether you’ve got a search tool for that dump or not.

I don’t follow Johnny•Decimal’s numbering convention, but I admire the general premise of keeping things tidy. Sure, it’s a bit weird and far from perfect, but making fun of it out of hand is like an SUV driver yelling at a pedestrian because, well, why do people do these things? Bigger is better?

I support.
posted by romanb at 1:45 AM on November 25, 2021 [2 favorites]


Thanks romanb for the non-glib response. I thought I was crazy when I thought that I should try this out. Not all parts of it, but "don't make too many folders" and "don't go too deep" are good advice.
He is still using folder names and such; it's just that the numbering makes it more easily machine-readable.
My PC is a dump and I need to clean it up before my next employment.
posted by Megustalations at 3:32 AM on November 25, 2021 [1 favorite]


I liked the bit where his system made everything keyboard-accessible. That speeds me up and improves my posture.

You can also get there with distinct alphabetic names, of course, like … sys, bin, usr… Numbering projects or jobs over the course of a career made sense to me too, though expecting anyone else to see those numbers maybe not.

May all our drives be just as ordered as we like them.
posted by clew at 12:00 PM on November 25, 2021 [2 favorites]


I liked the bit where his system made everything keyboard-accessible. That speeds me up and improves my posture.

Mine too. Which is one of the reasons why most of what I do with this computer that isn't web browsing involves a command line interface.

And I'm pretty good at naming things but not very good at remembering where I put stuff that's more than a few months old, which is why I've been mainly relying on locate to dig out the old stuff since well before Google was ever a thing.
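The reason locate answers instantly is that it searches a prebuilt database of paths (refreshed by updatedb) instead of walking the disk every time. A rough Python sketch of that split between a slow indexing pass and a fast lookup; matching only on the file name, and keeping the index in a plain list, are simplifications of mine.

```python
import os


def build_index(root: str) -> list[str]:
    """Walk the tree once and record every file path (the slow step)."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.append(os.path.join(dirpath, name))
    return sorted(index)


def find(index: list[str], fragment: str) -> list[str]:
    """Case-insensitive substring match on the file name, no disk access."""
    needle = fragment.lower()
    return [p for p in index if needle in os.path.basename(p).lower()]
```

Build the index once (or nightly), and every later `find` is a scan over strings in memory, however deep the old stuff is buried.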

Main thing I use folders for is keeping the files in any given folder strongly related to each other rather than trying to fit them into some kind of predesigned scheme; for me, folders are a tool more for keeping irrelevant stuff excluded than anything else.

May all our drives be just as ordered as we like them

and may we always remember that they are our drives, and if they're not, that's a problem that calls for our urgent attention.
posted by flabdablet at 12:22 PM on November 25, 2021 [1 favorite]


When people say that storage-classification method X is better than search, I say fix the search.

I'm looking at you Windows 10.

Yesterday I wanted to search my downloads folder which contains several thousand files. I knew the file had "wilson" in the file name. One minute later..... I'm still looking at the little bar filling up with green....ye gods.
posted by storybored at 4:57 PM on November 25, 2021


Possibly relevant.
posted by bendy at 1:12 AM on November 26, 2021


do kids not still learn command-line work in high school? as part of computer science education?

I didn't have computer science in school, but I did have typing class, wherein we learned command-line basics in order to type on our Zenith computers.
posted by eustatic at 8:50 AM on November 26, 2021 [1 favorite]


I am really confused about this thread's references to the "taxes" folder structure. I thought it was universally recognised that any folder named "taxes" contains porn?
posted by rum-soaked space hobo at 8:54 AM on November 26, 2021 [4 favorites]


Do kids not still learn command-line work in high school? as part of computer science education?

I certainly never learned it, in high school in the mid '90s or in actual CS classes in college soon after that. (I did learn it while in college, but not formally.) Everything in class was proprietary Windows software, IDEs, and incredibly tedious GUI tools. [Insert unhinged rant about Scheme here.] Today, in my field, the equivalent seems to be browser-based Python notebooks. Working with new grad students in the physical sciences in the last few years, only hobbyist hackers seem to have seen a shell prompt before. I'm sure the other students know useful things that I don't know, but watching them try to do things like rename multiple files occasionally makes me want to hit my head on a desk.
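For the record, the multiple-file rename is a few lines even in the Python those students already know. A minimal sketch; the function names and the "plan first, then apply" split are my own choices, made so the list can be checked before anything on disk changes.

```python
from pathlib import Path


def plan_renames(folder: Path, old: str, new: str) -> list[tuple[Path, Path]]:
    """List (source, target) pairs for files whose names contain `old`."""
    plan = []
    for path in sorted(folder.iterdir()):
        if path.is_file() and old in path.name:
            plan.append((path, path.with_name(path.name.replace(old, new))))
    return plan


def apply_renames(plan: list[tuple[Path, Path]]) -> int:
    """Perform the renames, refusing to overwrite existing files."""
    done = 0
    for source, target in plan:
        if target.exists():
            continue  # leave both files alone rather than clobber one
        source.rename(target)
        done += 1
    return done
```

Printing the plan before calling `apply_renames` is the equivalent of a dry run, which is the habit a shell prompt tends to teach.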
posted by eotvos at 8:35 AM on December 2, 2021


eustatic: "do kids not still learn command-line work in high school? as part of computer science education?"

I'd be shocked if they did.

In my case, the one CS class I took in college (which was far enough back that expecting you to be familiar with a command-line environment would be completely unsurprising) assumed you already knew Unix, and just threw you into the deep end. I didn't, which is why I only took one CS class in college.
posted by adamrice at 10:07 AM on December 2, 2021


do kids not still learn command-line work in high school? as part of computer science education?

This seems to assume a level of uniformity in high school education that I just don't think exists. Computer science isn't a required part of Common Core, which is really the closest thing we have to a national curriculum in the US. So, sure, students at a well-funded STEM magnet school are going to run across command lines, and a lot of technical stuff besides, but someone from a more run-of-the-mill school, especially if they were a more run-of-the-mill student, very likely wouldn't have.

That said, there's probably been no better time to be a self-directed learner interested in IT, because of the wide availability of tools and educational materials, and the relatively low cost of hardware.

I think this is part of why the IT field is chock full of somewhat-self-taught people with suspiciously similar professional origin stories, generally hinging on a significant amount of self-directed study (and requiring the free time, energy, and access to equipment and materials that it implies). It's a pretty strong filter.
posted by Kadin2048 at 10:08 PM on December 4, 2021


Not just high school, given the title "Missing Semester" for a university lecture series. (V good, I thought, filled in some corners that self-teaching hadn’t).
posted by clew at 11:45 AM on December 5, 2021 [2 favorites]

