Regex Dictionary
December 21, 2009 4:19 PM

Regex Dictionary - for those times when you want a web-based dictionary you can search with regular expressions.
posted by Wolfdog (31 comments total) 51 users marked this as a favorite

 
The finding unusual words examples are pretty neat and suggest some of the possibilities.
posted by Wolfdog at 4:21 PM on December 21, 2009 [2 favorites]


Those times come up pretty often for me
posted by motorcycles are jets at 4:28 PM on December 21, 2009 [3 favorites]


Perfect for crossword puzzles.
posted by Blazecock Pileon at 4:28 PM on December 21, 2009 [2 favorites]
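
A crossword slot with some known letters maps onto a regex in which `.` stands in for each unknown square. A minimal sketch, assuming a tiny made-up word list (`WORDS`) in place of the site's dictionary; the helper name `crossword_matches` is invented for illustration:

```python
import re

# Hypothetical stand-in for a real dictionary.
WORDS = ["cat", "cot", "cut", "coat", "cart", "scat"]

def crossword_matches(pattern, words):
    """Return the words that fill the whole slot, e.g. 'c.t' for C?T."""
    compiled = re.compile(pattern)
    # fullmatch anchors at both ends, so longer words such as 'scat' are rejected.
    return [w for w in words if compiled.fullmatch(w)]

print(crossword_matches("c.t", WORDS))   # ['cat', 'cot', 'cut']
print(crossword_matches("c..t", WORDS))  # ['coat', 'cart']
```

Anchoring matters here: an unanchored search for `c.t` would also accept "scat".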


In a somewhat similar vein: Omnipelagos finds the shortest paths between any two things.
posted by netbros at 4:33 PM on December 21, 2009 [2 favorites]


Damn, where was this yesterday when I was trying to solve some Flash-based cypher?
posted by AndrewStephens at 4:38 PM on December 21, 2009


Instead of teaching kids how to use a paper dictionary, every high school graduate should know regex.
posted by amuseDetachment at 4:44 PM on December 21, 2009


Holy crap, you just made the nerd in me jump for joy.
posted by Afroblanco at 4:45 PM on December 21, 2009


Oddly enough: We couldn't find any matches for the string /regex/ for any part of speech.
posted by twoleftfeet at 4:46 PM on December 21, 2009


Needs geographic nouns.

It also needs some tips on how to do those damned annoying "cryptic" crosswords.
posted by GuyZero at 4:47 PM on December 21, 2009


Died at index.cgi line 24.

nonetheless, i have this bookmarked for when it comes back alive. thank you, wolfdog.
posted by the aloha at 4:48 PM on December 21, 2009


every high school graduate should know regex

I don't think there's a formal standard for regex syntaxes. They all work more or less the same but I don't think there's an ANSI or ISO standard or anything. grep and most programming regex libs work somewhat differently for example. See ".*" vs "*".
posted by GuyZero at 4:50 PM on December 21, 2009


it's alive!
posted by the aloha at 4:50 PM on December 21, 2009


Man, I need to learn me some regular expressions.

I have ladies to impress!
posted by Askiba at 4:55 PM on December 21, 2009


GuyZero, if by ".*" vs "*" you mean "*" like in filenames, that's a glob, not a regex. There are a lot of extensions, and some deviations on the grouping standards, but ".", "[]", "*" and "?" mean the same in every variation I've seen ("+" isn't standard, but works in every implementation I've ever seen as well)
posted by qvantamon at 4:57 PM on December 21, 2009
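
The glob/regex distinction is easy to see side by side. A small sketch using Python's standard `fnmatch` (shell-style globs) and `re` modules; the sample names are made up:

```python
import fnmatch
import re

names = ["fo", "foo", "foooo", "foobar"]

# Glob: '*' on its own means "any run of characters".
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "foo*")]

# Regex: '*' repeats the preceding item, so 'foo*' is "fo" plus zero or more "o".
regex_hits = [n for n in names if re.fullmatch("foo*", n)]

# The regex spelling of the glob 'foo*' is 'foo.*'.
regex_as_glob = [n for n in names if re.fullmatch("foo.*", n)]

print(glob_hits)      # ['foo', 'foooo', 'foobar']
print(regex_hits)     # ['fo', 'foo', 'foooo']
print(regex_as_glob)  # ['foo', 'foooo', 'foobar']
```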


Yes, regexes are regexes mostly, but it's just a de facto thing. It would be fairly weird to have a regex lib that substitutes % for ? but there's no rule that dictates it.
posted by GuyZero at 5:00 PM on December 21, 2009 [1 favorite]


This is fantastic! I mean unix dorks have been doing grep -E 'regex' /usr/share/dict/words forever (it's how Samba was named) but it's nice to have a web gui on top of it.
posted by Skorgu at 5:00 PM on December 21, 2009
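
For anyone without grep handy, the same kind of word-list search can be sketched in Python. The word list and the helper name `grep_words` are invented here, and the pattern is the commonly told version of the Samba search (words with s, m, b in that order), which is an assumption rather than something stated in this thread. Python's `re` is not identical to `grep -E`, but simple patterns like this one behave the same way:

```python
import re

# Hypothetical stand-in for /usr/share/dict/words.
WORDS = ["samba", "mambo", "sample", "somber", "symbol"]

def grep_words(pattern, lines):
    """Return the lines containing a match, ignoring case (roughly grep -iE)."""
    compiled = re.compile(pattern, re.IGNORECASE)
    return [line for line in lines if compiled.search(line)]

print(grep_words(r"^s.*m.*b", WORDS))  # ['samba', 'somber', 'symbol']
```

To run it against a real dictionary file, pass the stripped lines of that file in place of `WORDS`.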


That's worth a bookmarkin'! Thanks, Wolfdog.

*brain creaks and wheezes trying to dredge up regex know-how*
posted by Quietgal at 5:02 PM on December 21, 2009


that's a glob, not a regex

No, doing a `grep "foo*" bar` is different from writing a script in Python or Perl and matching each line against "foo*", or worse yet, matching the entire file against "foo*". And bash's filename pattern matching/processing isn't quite globs or regexes, e.g. echo ${PWD##/*/}
posted by GuyZero at 5:03 PM on December 21, 2009
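
The difference between the line-by-line and whole-file cases can be sketched in Python. The sample text is made up; `search`, `fullmatch` and `splitlines` are standard library calls:

```python
import re

text = "fo\nfood\nbar\n"
pattern = re.compile("foo*")

# grep-style: test each line separately and keep the lines that contain a match.
line_hits = [line for line in text.splitlines() if pattern.search(line)]

# Whole-file search: a single answer for the entire string, with no notion of lines.
file_hit = pattern.search(text)

# Whole-string match: fails, because the pattern would have to consume all of the text.
full_hit = pattern.fullmatch(text)

print(line_hits)         # ['fo', 'food']
print(file_hit.group())  # fo
print(full_hit)          # None
```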


They all work more or less the same but I don't think there's an ANSI or ISO standard or anything

POSIX regex
posted by DU at 5:04 PM on December 21, 2009 [5 favorites]


GuyZero: Yeah, there are differences at the edges, but in practice there's support for a whole lot of Perl regex tokens. People just need to understand the basics; the efficiency gap is getting huge.

Also, those in Ubuntu may want to do:
sudo apt-get install wamerican
awk /hell.*o/ /usr/share/dict/words
posted by amuseDetachment at 5:08 PM on December 21, 2009


Is greedy vs non-greedy matching part of the POSIX heading on that Wikipedia article? Anyway, duh, POSIX. Thanks.
posted by GuyZero at 5:10 PM on December 21, 2009


Yeah, there are both real and de facto standards, but there are some major exceptions. The pseudo-regexes used by LexisNexis and Westlaw, for example, practically turn normal regexes on their head. * matches a single character and ! matches any string but only to the end of a word. So for example, f**t would match feet and foot. pat! would match path, paths, patch, patches, etc.
posted by jedicus at 5:26 PM on December 21, 2009
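
Going only by the description above, those two wildcards can be translated into ordinary regex. This is a simplified sketch, not the services' actual rules: the function names are invented, and treating `!` as "zero or more word characters" is an assumption.

```python
import re

def lexis_to_regex(term):
    """Translate the described wildcards into a compiled regex."""
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(r"\w")     # exactly one word character
        elif ch == "!":
            parts.append(r"\w*")    # rest of the word (assumed: may be empty)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

def matching(term, words):
    """Return the words that the translated term matches in full."""
    compiled = lexis_to_regex(term)
    return [w for w in words if compiled.fullmatch(w)]

print(matching("f**t", ["feet", "foot", "fit", "fleet"]))
# ['feet', 'foot']
print(matching("pat!", ["path", "paths", "patch", "patches", "spat"]))
# ['path', 'paths', 'patch', 'patches']
```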


I don't know if lazy quantifiers are POSIX. I just use Regex As Was Handed Down By The Sages Of Yore (i.e. anything that the tool I'm using [grep, awk, sed and Tcl] will accept).
posted by DU at 5:26 PM on December 21, 2009
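
On the greedy question: as far as I know, POSIX specifies leftmost-longest matching and does not define lazy quantifiers such as `+?`; those come from Perl-style engines. A small sketch of the difference using Python's `re`, which is Perl-style; the sample string is made up:

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: '.+' grabs as much as it can, so the match runs to the last '>'.
greedy = re.findall(r"<.+>", text)

# Lazy: '.+?' grabs as little as it can, so each tag is matched separately.
lazy = re.findall(r"<.+?>", text)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```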


amuseDetachment: s/wamerican/wamerican-huge/ unless you're pathologically short on disk space (it's Six! Whole! Megs!)
posted by Skorgu at 5:26 PM on December 21, 2009


Yeah, there's probably actually been a time where I've wanted that.

There really is no hope for me.
posted by DecemberBoy at 6:38 PM on December 21, 2009


awk /hell.

Funny. That's how we describe my uncle Norm.
posted by jock@law at 7:46 PM on December 21, 2009


Wow - failed on the first example I tried, which was "uptc" - apparently the dictionary does not know the word "bankruptcy"
posted by kcds at 8:01 PM on December 21, 2009


This is absolutely amazing, and perfect timing for a project I'm working on!
posted by iamkimiam at 11:25 PM on December 21, 2009


There are different flavors. Basic regex (POSIX BRE) is what all the old Unix tools originally used and is what you get if you just use "grep"; extended regex (POSIX ERE) is what you get with egrep or grep -E and is what most programs are referring to if they allow regex searches; and PCRE, or Perl-compatible regex, is like POSIX extended regex with a whole shit-ton of extensions.
posted by Rhomboid at 2:43 AM on December 22, 2009
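
One example of a Perl-derived extension that the POSIX flavors do not define is negative lookahead. A minimal sketch in Python, whose `re` module is Perl-style rather than PCRE itself; the word list is made up:

```python
import re

# Hypothetical sample words.
WORDS = ["queen", "qat", "tranq", "quiz", "faqir"]

# 'q(?!u)' matches a q that is NOT immediately followed by u.
q_without_u = [w for w in WORDS if re.search(r"q(?!u)", w)]

print(q_without_u)  # ['qat', 'tranq', 'faqir']
```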


Also, surprised no one dropped the Jamie Zawinski quote yet:

"Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems."
posted by GuyZero at 9:47 AM on December 22, 2009 [1 favorite]


That's a great site! There are 408 words of 2 or more letters that can be written with just the top row of letters on my keyboard, including "typewriter".
posted by mdoar at 1:10 PM on December 22, 2009 [1 favorite]
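
A search like that one is a single character class with a length bound. A sketch assuming a QWERTY layout and a tiny made-up word list, so the count will not match the 408 reported above:

```python
import re

# Hypothetical sample words; a real run would use a full dictionary.
WORDS = ["typewriter", "pepper", "regex", "it", "i", "keyboard"]

# Two or more letters, all from the top row of a QWERTY keyboard.
top_row = re.compile(r"[qwertyuiop]{2,}", re.IGNORECASE)

top_row_words = [w for w in WORDS if top_row.fullmatch(w)]

print(top_row_words)  # ['typewriter', 'pepper', 'it']
```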

