Common Markdown, a robust and standardized subset of Markdown
August 10, 2022 1:22 PM

An old thing, but a good thing.
CommonMark [is a] strongly defined, highly compatible specification of Markdown[, …] a plain text format for writing structured documents, based on formatting conventions from email and usenet… the following sites and projects have adopted CommonMark: Discourse, GitHub, GitLab, Reddit, Qt, Stack Overflow / Stack Exchange, Swift (Markdown for MeFi)
Reference Card and Interactive Tutorial
posted by Going To Maine (62 comments total) 23 users marked this as a favorite
 
Well, Allright.

Huh. Not what I expected.
posted by Bee'sWing at 1:34 PM on August 10, 2022


strongly defined

Why isn't there an EBNF grammar? Is it even deterministically parseable?
posted by 1970s Antihero at 1:50 PM on August 10, 2022 [8 favorites]


Well, now I really want to know what you expected!
posted by Going To Maine at 1:50 PM on August 10, 2022 [6 favorites]


I was trying Heading 1
expecting big and bold. But, no dice. (it works on reddit).
posted by Bee'sWing at 1:55 PM on August 10, 2022


Note CommonMark is 8 years old now. A much needed standardization effort. There was a small nontroversy about the name which is why it's no longer called Standard Markdown. But to the extent there is a standard markdown, it is this.
posted by Nelson at 1:55 PM on August 10, 2022 [4 favorites]


Yep. Not new, but surely the best of the web.
posted by Going To Maine at 1:57 PM on August 10, 2022 [5 favorites]


Mark* formats have not been debated in a good while. LET THE BATTLE BEGIN
posted by Abehammerb Lincoln at 2:28 PM on August 10, 2022 [2 favorites]


Why isn't there an EBNF grammar? Is it even deterministically parseable?

Considering og Markdown was parsed by a badly written throwaway Perl script... *hands waving around in the air*
posted by alex_skazat at 2:56 PM on August 10, 2022 [2 favorites]


Previously
posted by a snickering nuthatch at 2:58 PM on August 10, 2022 [1 favorite]


Take care not to call it Common Markdown; that pisses off John Gruber (Markdown's creator) too, as noted at the very end of the post Nelson linked to.

My understanding is that Gruber has consistently declined to engage with anyone who wants to refine or improve his original spec, which he maintains is adequate. This has long reminded me of Dave Winer's approach with the RSS 2.0 spec, which he declared "frozen" for all time. Neither stance has served its community of users well. Because of Winer, we have Atom (for which I am grateful), and because of Gruber, we have CommonMark (ditto).
posted by /\/\/\/ at 3:01 PM on August 10, 2022 [12 favorites]


Yeah, I'm lazy and still use Gruber's Daring Fireball flavor. Does everything I need and more...
posted by jim in austin at 3:14 PM on August 10, 2022


Now if we could just merge CommonMark and Atom and include code to send and read email...
posted by sammyo at 3:36 PM on August 10, 2022


Why would it need EBNF? It’s not a language, just a core set of formatting markers that can be matched without the need of a complex parser. It’s kind of built to be handled by simple regex.

Any language-related things are extensions for specific purpose, but mostly able to be handled by a simple regex, as well.
posted by JustSayNoDawg at 3:44 PM on August 10, 2022
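
[Editor's aside] The regex approach described in the comment above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a CommonMark parser: the function name `tiny_markdown` is made up for illustration, and it only handles ATX headings, `**bold**`, and `*italic*`. Nesting, lists, and code spans are where a regex-only approach tends to run into trouble, which is roughly what the EBNF question upthread is getting at.

```python
import re

def tiny_markdown(text: str) -> str:
    """Convert a tiny subset of Markdown to HTML using only regexes.

    Hypothetical helper for illustration; not a CommonMark implementation.
    It does not escape HTML and does not handle nesting or code spans.
    """
    out = []
    for line in text.splitlines():
        # ATX heading: one to six '#' characters, at least one space, then the title.
        m = re.match(r"^(#{1,6}) +(.*?)\s*$", line)
        if m:
            level = len(m.group(1))
            line = f"<h{level}>{m.group(2)}</h{level}>"
        # Bold is handled before italic so '**' is not consumed as two '*' markers.
        line = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", line)
        line = re.sub(r"\*(.+?)\*", r"<em>\1</em>", line)
        out.append(line)
    return "\n".join(out)
```

Calling `tiny_markdown("## Heading 2")` gives `<h2>Heading 2</h2>`; anything outside that small subset passes through unchanged.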


Phew that old thread is quite a journey. Many nerds (including myself) expressing many opinions, lots of gnashing of teeth about the naming nontroversy.

I'll stand by what I said there: what really mattered in the end is what major Markdown-using sites did. And the choice is overwhelmingly CommonMark or CommonMark-derived. GitHub is based on CommonMark. Reddit migrated to CommonMark. Hugo, the blog engine, uses CommonMark. And of course Stack Exchange uses a CommonMark derivative. Discord may be an exception; it's a very limited subset suitable for the chat app, but then they added a spoiler tag (still missing from CommonMark).

I'd forgotten that Aaron Swartz was part of the original Markdown work. It's been almost two years since I felt a strong twinge of sadness at his passing, but now I reset the counter.
posted by Nelson at 3:47 PM on August 10, 2022 [12 favorites]


Sadly, the Markdown for Metafilter for Safari won't work with modern Safari. Seems ripe for reimplementation as a userscript.
posted by adamrice at 3:57 PM on August 10, 2022 [1 favorite]


![Under construction](https://upload.wikimedia.org/wikipedia/commons/d/d9/Under_construction_animated.gif)
posted by gwint at 4:12 PM on August 10, 2022 [3 favorites]


Github has a good blog post about their experience migrating from Sundown to their fork of cmark (the CommonMark reference implementation).
posted by credulous at 4:16 PM on August 10, 2022 [2 favorites]


Markdown and its various flavours have been so dominant that I had to look up the name of what was once a robust competitor to Markdown but has since disappeared for somewhat unknown reasons: Textile.

I'm not sure why one disappeared while the other thrived. Admittedly, I haven't thought about it in at least a decade.
posted by chrominance at 4:17 PM on August 10, 2022 [2 favorites]


I mean, there’s also reStructuredText out there, but I think it’s doing ok - I just don’t want to ever use it.
posted by Going To Maine at 4:22 PM on August 10, 2022 [1 favorite]


Previously

Kinda bummed now it wasn't called Fartgoop.
posted by alex_skazat at 4:34 PM on August 10, 2022


Two tangents for you. One, pandoc is indispensable to my workflow. I start everything in Markdown and convert it at the last possible moment with pandoc, which seems to be able to output everything under the sun. I hadn't realised that pandoc's author is one of the people behind CommonMark, but it makes sense.

Second, I recently read Hillel Wayne noting that Markdown is not suitable for genuinely structured text. And I think he has a solid point. But I'm not gonna memorise RST at this stage of my life until I really, really need to.
posted by i_am_joe's_spleen at 4:39 PM on August 10, 2022 [3 favorites]


This has long reminded me of Dave Winer's approach with the RSS 2.0 spec, which he declared "frozen" for all time. Neither stance has served its community of users well. Because of Winer, we have Atom (for which I am grateful), and because of Gruber, we have CommonMark (ditto).
Maybe these aren't analogous because it turns out that Atom wasn't really needed.

RSS feeds are (still) way more popular than Atom feeds. Atom feeds are secondary, so every Atom feed has an RSS feed but not vice versa. Atom use these days is due mostly to CMSs creating Atom feeds by default to complement the main (RSS) feed, which neither users nor subscribers are generally aware of (which is fine).

Yes, some people have personal issues with Dave. Yes, the RSS spec was imperfect — but it had/has one. RSS will outlive us all. Sometimes it's okay for something to be "good enough".
posted by ArmandoAkimbo at 4:42 PM on August 10, 2022 [6 favorites]


I think Markdown is kind of dumb. For most use cases the HTML isn't that much more complicated.

## Heading 2
[h2]Heading 2[/h2]


I know, but the real tags get stripped.

My use case is a web developer who would only consider Markdown for nontechnical people, and in my experience people who can't handle HTML likely also can't handle Markdown.
posted by kirkaracha at 5:06 PM on August 10, 2022 [4 favorites]


the following sites and projects have adopted CommonMark:
Also, the open source note-taking app Joplin, which finally got me off Evernote.
posted by Hardcore Poser at 5:14 PM on August 10, 2022 [1 favorite]


Markdown or whatever does make sense when you look at the primordial goo it crawled out of. If I am writing a plaintext email, and I want some sort-of formatting, it starts looking like Markdown, so it does make sense to just make it HTML for me, for things that know how to display HTML. I added support to my thingy that you can send plaintext email to, that then makes it HTML. Markdown really does work better than any other plaintext-to-HTML function I've ever written or seen, so I actually use it for that job, too, app-wide (but I'm evil).

Also it's a total PITA to accept HTML in things like this very comment box but only a small subset and not find any of the weird edge cases so once you've thrown up your hands enough times you go, "fine you win, no HTML is allowed, but if we find something that looks like Markdown, that'll get displayed as HTML".
posted by alex_skazat at 5:14 PM on August 10, 2022 [4 favorites]


Although Markdown seemed a really curious creation once you realize Perl already has Perl POD, which is another dead simple way to format plain text, and Markdown was written in Perl. I guess there's more than one way to do it!
posted by alex_skazat at 5:16 PM on August 10, 2022 [2 favorites]


I work in an ordinary white-collar office, which is like everywhere else dominated by doing all our reports in MS Word. A couple of years ago I spent a few weeks putting together my own pandoc templates (NB. that this could have been done by a competent technician in an afternoon, but I was learning) and now I just go markdown --> pandoc --> docx or latex --> PDF, and get a better and more reliable result. I've drawn up a serious reference document this way, of several hundred pages, with cross-references in the PDF that actually work, a bibliography, and a full index.

'Good enough' is absolutely right. The alternative isn't a perfect markup format to be polished like a scholastic argument, the alternative is getting elbows-deep with Word templates from 2006 that don't have styles in them and have all their images in nested tables. My God. My God.
posted by Fiasco da Gama at 5:17 PM on August 10, 2022 [5 favorites]
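
[Editor's aside] For anyone curious what the markdown --> pandoc --> docx/PDF step looks like when scripted, here is a minimal sketch in Python. The file names and the helper names `pandoc_command` and `convert` are assumptions for illustration, not the commenter's actual setup. It relies only on pandoc's basic `pandoc INPUT -o OUTPUT` invocation, where the output format is inferred from the target's extension.

```python
import subprocess

def pandoc_command(source: str, target: str, extra_args=()) -> list:
    """Build the argument list for a basic pandoc conversion.

    pandoc infers the output format from the target's file extension
    (for example .docx, .tex, or .pdf).
    """
    return ["pandoc", source, "-o", target, *extra_args]

def convert(source: str, target: str, extra_args=()) -> None:
    """Run the conversion; raises CalledProcessError if pandoc fails.

    Requires pandoc on PATH. PDF output additionally needs a LaTeX
    engine (or another supported PDF engine) installed.
    """
    subprocess.run(pandoc_command(source, target, extra_args), check=True)
```

Something like `convert("report.md", "report.docx")` would produce the Word file, and passing `["--toc"]` as `extra_args` asks pandoc for a table of contents. Custom templates, bibliographies, and indexes of the kind described above need more options than this sketch shows.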


Markdown is brilliant. The simple syntax gets out of the way enough to make it human readable, but is structured just enough that it can be interpreted by parsers and turned into more complex formats like styled HTML. Even when I'm not in a Markdown enabled editor I find myself just automatically using the syntax because it Just Makes Sense.
posted by gwint at 5:19 PM on August 10, 2022 [4 favorites]


Can you embed latex in markdown? Cause Σ ( √ xi² + ε ) is just a little bit, but important to sum.
posted by sammyo at 5:43 PM on August 10, 2022 [4 favorites]


HTML is certainly not as easy to write as Markdown, nor is it as easy to read. And yes, Pandoc is actually some sort of magic.
posted by lhauser at 5:53 PM on August 10, 2022 [5 favorites]


What ever happened to HyTime? Hypermedia and time based structuring language.
posted by ahimsakid at 6:14 PM on August 10, 2022 [1 favorite]


Markdown isn't meant to supersede LaTeX or HTML or any other markup language. It was written as a low-effort way to make human-readable plaintext that can be run through a parser to make a document with richer formatting (typically HTML), originally for stuff like blog posts. Its popularity is almost entirely down to its simplicity and ease of handling; no HTML WYSIWYG editor needed, no HTML sanitizer logic required, and not clunky to write like BBCode, which makes it awesome for use on websites.

"Can you do X in markdown" where X is a thing that typically is done in a more purpose-built markup language, is one of those things where it's like, yeah, you can and many have, but it's not all that impressive as all you're doing typically is adding that heavier markup embedded inside the markdown doc somewhere and forking the parser to interpret it and insert it into that richer document format. It's better to think of it as a transpiled language that outputs to some other markup language or document format as that's almost always how it's used.
posted by Aleyn at 6:18 PM on August 10, 2022 [5 favorites]


I once wrote a proposal for a job in Markdown and converted it to PDF with Pandoc. Apparently it accomplishes this by first converting it to LaTeX. It ended up looking like an academic paper. The people I sent it to were super impressed and took me for a much more serious computer scientist than I am.
posted by jordemort at 6:27 PM on August 10, 2022 [16 favorites]


I like Sphinx paired with MyST (CommonMark with RST-like sprinkles). It's definitely more heavyweight than Pandoc, though. I might use them both in parallel just to keep them both honest.

I used to think Pandoc could parse LaTeX as a source format, but I was wrong. Even LaTeX can't fully parse LaTeX.

That is a funny anecdote about the hideous 1980s default LaTeX template, jordemort. Anyone who has used LaTeX extensively will be super impressed if you can make its output *not* look like it came from LaTeX.
posted by credulous at 6:49 PM on August 10, 2022 [1 favorite]


> Reddit migrated to CommonMark.

Oh? Last I saw triple-backtick code blocks were horribly broken.
posted by genpfault at 6:56 PM on August 10, 2022


In re Reddit: and a whole lot more!

Seriously, Reddit just keeps getting brokener and brokener. God only knows how many special-purpose apps are out there to read a site that should function in a damn browser.
posted by JustSayNoDawg at 7:10 PM on August 10, 2022 [3 favorites]


I also use a workflow based on Markdown, Pandoc, and GPP for some of my work (Pandoc-flavored Markdown has a lot of useful extensions). Took me a long time as a quasi-technical person to figure it out, but the results are pretty satisfying, and the fact that I can produce clean HTML and an ICML file from the same source is useful.

AsciiDoc would be better for some of the stuff that I do (Pandoc-flavored Markdown emulates some of its features, IIRC), but like Textile, it never really caught on, and there seems to be only one parser for it.
posted by adamrice at 7:34 PM on August 10, 2022


I saw a high-end programmer comment on his Reddit job interview: they're culturally a media company, not a tech company.

I think it explains the state of their website.
posted by SunSnork at 8:32 PM on August 10, 2022


> Oh? Last I saw triple-backtick code blocks were horribly broken.

The new.reddit Markdown parser did not get backported to old.reddit.
posted by catachresoid at 8:43 PM on August 10, 2022 [1 favorite]


I'm another person who uses Markdown/Pandoc in their workflow. I even set up a save-as-word function in EMACS that, well, saves a buffer as a Word document.

Let me crowdsource a bit: Are there any filters/plug-ins that let Word, OneNote, and/or Outlook use standard Markdown/CommonMark? I really find typing the symbols easier than clicking the bar. I know they support their own version (asterisk-text-asterisk is bold rather than italics), and I find CTRL-B a bit unpredictable (and not standard across applications).

In a way, my preference is in part driven by sometimes not knowing where formatting ends. With Markdown, I can see it. It reminds me of Reveal Codes in WordPerfect. Yes, I'm one of those guys.

I think Markdown is kind of dumb. For most use cases the HTML isn't that much more complicated.

I don't know. Italics in Markdown is two keystrokes (asterisks on either side); HTML requires [i] and [/i] (retaining the bracket convention, since comments don't show the greater-than/less-than symbols). Basically, it's less typing.

What's more, it's easier to read the source file. cat readthis.md looks a bit easier to follow; cat readthis.html throws a lot more to have to parse in your head. Maybe if I lived in HTML all day, they would be equivalent.
posted by MrGuilt at 8:52 PM on August 10, 2022 [4 favorites]


Ooh, nifty! I have opinions on Markdown; I've gotten to use it a lot, in turning a hefty (around 1000 pages) scanned book / ebook into a GitHub repo of Markdown files, full of Markdown edge cases. Using a handful of different parsers is interesting; not breaking in either GitHub or pandoc can be tricky. On my iPad, I've used a notes app to compose and a Markdown-to-html webapp for comments and posts here; it's somewhat easier than checking and fixing smartquotes or going allll the way over to a computer.

What's worked for the book: use Markdown when you can, but html is available as an escape pod when it's needed.


> now I just go markdown --> pandoc --> docx or latex --> PDF, and get a better and more reliable result. I've drawn up a serious reference document this way, of several hundred pages, with cross-references in the PDF that actually work, a bibliography, and a full index.

Can you have links to specific sections between different Markdown files? That's been a major pain point in pandoc for me.


> Can you embed latex in markdown? Cause Σ ( √ xi² + ε ) is just a little bit, but important to sum.

I think that's still up to the parser. GitHub recently added math support, with $ and $$ delimiters. I think pandoc can do something similar? Using SVG files worked better for me, across parsers; I only had around a dozen, and they currently disappear in night mode because of how I made them. I got SVG images working on GitHub, with PNG fallbacks, with careful html. Both yay for this working, and boo that it wasn't a piece of cake.

You can try babelmark to check several different implementations quickly.
posted by Pronoiac at 9:29 PM on August 10, 2022
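
[Editor's aside] As a rough illustration of the `$$` delimiters mentioned above, here is one way the expression from the earlier question might be written. This assumes the ε belongs inside the square root, which the plain-text original leaves ambiguous, and it assumes a renderer with dollar-delimited math enabled.

```latex
$$
\sum_i \sqrt{x_i^2 + \varepsilon}
$$
```

If the ε was meant to sit outside the root, the body would instead be `\sum_i \left( \sqrt{x_i^2} + \varepsilon \right)`. Inline math uses single `$` delimiters in the same way.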


It's better to think of it as a transpiled language that outputs to some other markup language or document format

Yeah, a bit like .dvi was envisioned by some? But human readable and very different.

Anyway, Common Mark as imperfectly implemented in the wild has been a pleasant surprise in my book: useful, clean, and fairly intuitive. Good job everybody!

Also pandoc + markdown is really nice, and absolutely worth the minimal marginal investment for a lot of people who are already invested in this type of work, especially if they are somehow forced to interact with MS word formats.
posted by SaltySalticid at 9:35 PM on August 10, 2022


Lot of experts in here so let me please take a chance to ask:

Markdown for technical documentation is clearly not recommended (outside of small readme.md files, perhaps) so for folks that do have to create technical documentation at work, what's your fav way to do so?
posted by lazaruslong at 12:30 AM on August 11, 2022


I've used AsciiDoc for longer, technical documents (manuals for software) and I was very happy with it - it has many similarities to Markdown so there's very little effort needed to do Markdown-style documents, and the extra features are available when you need them.
posted by BinaryApe at 12:35 AM on August 11, 2022 [2 favorites]


In my workplace (an IT services company that does a bunch of bespoke dev too), Sphinx using reStructuredText is the preferred way.
posted by i_am_joe's_spleen at 1:36 AM on August 11, 2022 [1 favorite]


It's still a trivial solution to a trivial problem being celebrated as the best thing since sliced bread. Dropping down to HTML when you need to use unsupported features (internal xrefs, tables) is such a huge cop-out
posted by scruss at 1:45 AM on August 11, 2022


oh jeez and here I am just wishing Atlassian would pick any one flavour of Markdown to use consistently
posted by harriet vane at 3:24 AM on August 11, 2022 [7 favorites]


Can you have links to specific sections between different Markdown files? That's been a major pain point in pandoc for me.

Sphinx (RST and/or MyST) is a cross-referencing beast. It would just be nice if you could install it without spewing Python packages everywhere; I have to isolate it in its own Conda environment and then pip freeze when I change anything lest I forget.
posted by credulous at 5:53 AM on August 11, 2022 [1 favorite]


for folks that do have to create technical documentation at work, what's your fav way to do so?

I like LaTeX for that, but full disclosure, it's also my favorite way to create any text document longer than an email, or if it's something people might print on paper.
posted by SaltySalticid at 5:57 AM on August 11, 2022 [3 favorites]


Maybe these aren't analogous because it turns out that Atom wasn't really needed.

Agreed. That totally unnecessary "Dave is abrasive, let's work around him" pissing match accomplished nothing but a lot of heat and noise. The original justification for Atom's necessity looks painfully thin in hindsight.
posted by ook at 6:34 AM on August 11, 2022


Though RSS is effectively a marginal, if not dead, standard. The open ecosystem of RSS feeds has been eaten away by Zuckerbergian casino-style silos that can manipulate engagement to maximise revenue. RSS's main use case these days is in podcast browsers (and even there, Spotify's internal silo-based “podcast” functionality, which does not use RSS or any other mechanisms exposed to the outside world, is encroaching on it).
posted by acb at 6:51 AM on August 11, 2022 [1 favorite]


I can assure everyone that there is a great deal of difference between these following two lines:

## Heading 2
[h2]Heading 2[/h2]

The first line is far, far easier to write than the second.

I remember (or likely misremember) a story about Digg and how changing voting on posts from a two-click process to a one-click process increased engagement by 400%

When I was learning Unix, I thought it weird that the copy command was "cp" instead of DOS's "copy". Was it that big of a deal to not type two characters? I would later find out that when you're typing commands all day: yes, absolutely.

You can't dismiss (though engineering nerds like myself often do) how much convenience and a better user interface matter, even when it's a seemingly small change. Neal Stephenson wrote about this in Snow Crash as far back as 1992:
It wasn’t until a number of years later, when they both wound up working at Black Sun Systems, Inc., that he put the other half of the equation together. At the time, both of them were working on avatars. He was working on bodies, she was working on faces.

She was the face department, because nobody thought that faces were all that important — they were just flesh-toned busts on top of the avatars. She was just in the process of proving them all desperately wrong. But at this phase, the all-male society of bit-heads that made up the power structure of Black Sun Systems said that the face problem was trivial and superficial.

It was, of course, nothing more than sexism, the especially virulent type espoused by male techies who sincerely believe that they are too smart to be sexists.
posted by AlSweigart at 7:02 AM on August 11, 2022 [2 favorites]


It's still a trivial solution to a trivial problem being celebrated as the best thing since sliced bread. Dropping down to HTML when you need to use unsupported features (internal xrefs, tables) is such a huge cop-out

I think it's "the problem we can solve, with the amount of effort we're willing to invest", as compared to "the problem(s) we should solve", and I'm reluctant to call that a cop-out. Dom-diff and dom-merge, those aren't easy. Interactive terminals that support an HTML5 display layer and are backwards compatible, that's a hard problem.
posted by mhoye at 7:08 AM on August 11, 2022


i_am_joe's_spleen: "I recently read Hillel Wayne noting that Markdown is not suitable for genuinely structured text."

There is Lightweight DITA, which is definitely structured text, and is probably more "heavyweight" than RST, but does let you use Markdown where possible. Although it seems to ignore some of the extensions to Markdown that allow for succinct markup, and fall back to HTML elements.
posted by adamrice at 8:40 AM on August 11, 2022


It's still a trivial solution to a trivial problem

Text files that are human readable, backwards compatible, yet structured enough to be parsed easily for styled display-- this is a notoriously difficult problem. The praise, I think, is due to finding that sweet spot of usefulness and then not going further, because if you keep going one day you wake up and you've invented LaTeX.
posted by gwint at 4:11 PM on August 11, 2022 [5 favorites]


"Prettier", the JS/TS/etc. autoformatter, will now automatically and by default reformat your markdown files and it makes me think that perhaps Gruber had a point objecting to a formal spec.
posted by Pyry at 6:29 PM on August 11, 2022


I implemented markdown for thoughts.page, and to be honest, I regret it. I've been meaning to write about why, but the long and short of it is that it's often very difficult to get specific things to show up, since there's no clear way to escape stuff (and in many implementations no way to escape stuff at all). I refuse to take seriously a markup language that does not allow you to represent every possible plaintext — a metric which essentially all markdown implementations fail.

For a while, tildes were a sort of ~~~sarcasm mark~~~, but that's strikethrough in most markdown implementations, which has largely killed that bit of internet language altogether.

At some point I'm probably going to implement a dropdown next to the post field to select the formatting from one of markdown, html, and plain text, but that's work I haven't bothered with yet.
posted by wesleyac at 6:48 PM on August 11, 2022 [3 favorites]
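
[Editor's aside] On the escaping point: the CommonMark spec does say that any ASCII punctuation character may be backslash-escaped, which gets part of the way there. Whether a given site's implementation honors that is another matter, as the comment above notes. A minimal sketch, assuming a CommonMark-compliant renderer on the other end (the function name `escape_for_commonmark` is made up for illustration):

```python
import re
import string

# string.punctuation is the set of 32 ASCII punctuation characters.
_PUNCT = re.compile("([" + re.escape(string.punctuation) + "])")

def escape_for_commonmark(plain: str) -> str:
    """Backslash-escape every ASCII punctuation character in plain text.

    Deliberately over-escapes. It does not address leading indentation
    (which can still start an indented code block) or renderers that
    ignore backslash escapes.
    """
    return _PUNCT.sub(r"\\\1", plain)
```

With this, `~~~sarcasm~~~` becomes `\~\~\~sarcasm\~\~\~`, which a compliant renderer should display as literal tildes rather than strikethrough or a code fence.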


then of course there is Morgan McGuire's markdeep, in case we need another standards battle to never be resolved.
posted by bruceo at 7:20 PM on August 11, 2022


Text files that are human readable, backwards compatible, yet structured enough to be parsed easily for styled display-- this is a notoriously difficult problem.

Seconding this.

Authoring good-looking, readable, non-trivially formatted plaintext files is surprisingly difficult.

For instance, the IETF RFC documents are generally written in an editor of your choice, in the markup of your choice (wouldn't surprise me if it's Markdown now, at least in some cases), then massaged into XML. The XML is processed to produce those beautiful (well, kinda) paginated text files we all know and love. I think back in the day it used to be TROFF macros to produce the final text, but pretty sure it's been XML for at least the last 10-15 years.

Plain text with Markdown markup has some advantages over WYSIWYG (whether it's Word or HTML or RTF or whatever on the backend), in particular that it doesn't have any invisible shit in it. Except for the trailing spaces at the end of lines having significance (and many editors will show them when editing Markdown files), there's nothing hidden in a Markdown-text file to bite you in the ass later.
posted by Kadin2048 at 8:26 PM on August 11, 2022
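
[Editor's aside] The trailing-space caveat above is easy to check for mechanically. A small sketch in Python that flags candidate lines, meaning non-blank lines ending in two or more spaces, which Markdown treats as a request for a hard line break. The function name `hard_break_lines` is hypothetical, and the check is only a heuristic: it does not know whether a line sits inside a code block or at the end of a paragraph, where the spaces have no effect.

```python
def hard_break_lines(text: str) -> list[int]:
    """Return 1-based numbers of non-blank lines ending in two or more spaces."""
    hits = []
    for number, line in enumerate(text.split("\n"), start=1):
        # Skip whitespace-only lines; they are just blank lines to Markdown.
        if line.strip() and line.endswith("  "):
            hits.append(number)
    return hits
```

Running it over a file's contents gives the line numbers worth a second look in an editor that does not show trailing whitespace.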


“tHeRe’S nO rEpLacEmEnT fOr ThE iRoNiC tIlDe”

Seriously, though, alternating case is much more frustrating to type.
posted by Going To Maine at 10:22 PM on August 11, 2022


(But I also probably use strike-through more than sarcasm indicators, or just fall back to the already overloaded italics.)
posted by Going To Maine at 10:23 PM on August 11, 2022


Aleyn: Its popularity is almost entirely down to its simplicity and ease of handling; no HTML WYSIWYG editor needed, no HTML sanitizer logic required, and not clunky to write like BBCode, which makes it awesome for use on websites.

Danger, Will Robinson!

Markdown, by design, passes HTML through unfiltered. Be very careful if using Markdown in a context where "HTML sanitizer logic" is a relevant concern. There are extensions for HTML sanitizer logic, but they are implementation-dependent and not part of the Markdown or CommonMark specs. Make sure you understand what you need and what you're getting.
posted by swr at 9:05 AM on August 12, 2022 [1 favorite]
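
[Editor's aside] To make the warning above concrete, here is about the bluntest possible mitigation: escape the source before it reaches the Markdown renderer so embedded tags show up as text. This is a sketch, not a sanitizer. The function name `neutralize_raw_html` is made up, it breaks `>` blockquote syntax, and it does nothing about things like `javascript:` link targets. A real deployment would more likely run an allowlist-based sanitizer over the rendered HTML.

```python
import html

def neutralize_raw_html(markdown_source: str) -> str:
    """Escape &, < and > so embedded HTML is displayed rather than interpreted.

    Caveat: the '>' that starts a blockquote is escaped too, so blockquote
    syntax is lost. Quotes are left alone (quote=False).
    """
    return html.escape(markdown_source, quote=False)
```

Emphasis markers and other punctuation-based syntax survive, so `*hi*` still renders as italics after this step.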



