12 Comments

Verbal Expressions is also a fun read: it shows how people try to make regex clearer with syntactic sugar.

I am not sure why Perl would be mentioned in this context, since it came really, really late to this game. The first release of Perl was in 1987. Regular expressions had spread across the CS curriculum by the mid-1970s, for example through lex and yacc. Even if you didn't learn them in class, many folks were working with them in ed, sed, and vi by the mid-'70s.

small typo: looks like you forgot to escape the period in a regex. So instead of "^t.co$", it should be "^t\.co$".

Just goes to show that regexes are tricky!
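
For anyone who wants to see the difference, here is a minimal sketch in Python. The list of test strings is made up for illustration; only the two patterns come from the comment above.

```python
import re

# The pattern as written, and the pattern with the period escaped.
unescaped = re.compile(r"^t.co$")   # "." matches any single character
escaped = re.compile(r"^t\.co$")    # "\." matches only a literal period

# Hypothetical test strings, chosen for illustration.
hosts = ["t.co", "taco", "tXco"]

unescaped_hits = [h for h in hosts if unescaped.search(h)]
escaped_hits = [h for h in hosts if escaped.search(h)]

print(unescaped_hits)  # every string in the list matches
print(escaped_hits)    # only the literal "t.co" matches
```

The unescaped version accepts any four-character string of the form t, anything, c, o, while the escaped version accepts only the literal host.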

«This meant that sites such as “microsoft.com,” “reddit.com,” and even Russia’s own state media outlet “rt.com” were rendered suddenly inaccessible.»

It’s really not a good idea, when discussing regex and literal text, to include extraneous characters like commas in the quoted examples. “microsoft.com,” is not a valid domain.

Well, the “t.co” problem probably wasn’t a regex. If they were naive enough to misuse a regex, they likely would have just used “t.co” itself in which the “.” stands for any character at all. They probably just did a simple search for those characters. Conversely, with a proper regex, it would be easy to search only at the end of the string!
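
A rough sketch of that guess in Python, assuming the block really was a plain substring search (nobody here knows what the actual filter looked like). The domain list is the three sites quoted from the article plus a made-up "example.org", and the anchored pattern is just one way to write it.

```python
import re

# Domains quoted from the article, plus a hypothetical control.
domains = ["t.co", "microsoft.com", "reddit.com", "rt.com", "example.org"]

# Assumed behaviour: block anything that contains the characters "t.co".
substring_hits = [d for d in domains if "t.co" in d]

# One possible anchored regex: the host is "t.co" or ends in ".t.co".
exact_host = re.compile(r"(?:^|\.)t\.co$")
regex_hits = [d for d in domains if exact_host.search(d)]

print(substring_hits)  # catches microsoft.com, reddit.com and rt.com too
print(regex_hits)      # catches only t.co
```

Anchoring with "$" and escaping the period is what keeps "microsoft.com" out, since it merely contains "t.co" rather than ending with it.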

Came here to say this. Regex nerds unite!

Yes, exactly, it would have blocked tAco, tBco, etc. etc. because dot means “any character”.

Russia trying to prevent its citizens from searching for delicious "taco" meals.

I'm sure you've heard this a lot by now, but "a*b*" is not the regex for the grammar you described. It should be "|ab|*".

I think you mean "(a|b)*", unless the forum software translated it to "|ab|*".

Just realised perhaps you were going for "[ab]*"
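
A quick Python check of the three candidates, assuming the grammar in question is "any string made up of a's and b's" (that is my reading of this thread, not something quoted from the article). The sample strings are invented for illustration.

```python
import re

# Hypothetical sample strings, including the empty string and one with a "c".
samples = ["", "aabb", "ba", "abab", "abc"]
patterns = ["a*b*", "(a|b)*", "[ab]*"]

# re.fullmatch requires the whole string to match the pattern.
results = {p: [s for s in samples if re.fullmatch(p, s)] for p in patterns}

for pattern, accepted in results.items():
    print(pattern, accepted)
```

"a*b*" only accepts strings where all the a's come before all the b's, so it rejects "ba" and "abab". "(a|b)*" and "[ab]*" accept the same set of strings, in any order of a's and b's.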
