Regex-Slidedeck



On GitHub: ClockworkNet / Regex-Slidedeck

RegEx

The World's Best / Worst Problem Solver

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

- Jamie Zawinski / @jwz


It's Ubiquitous

  • Text Editor Search & Replace
  • Form Validation
  • CSS Selectors
  • Bash
  • Almost all languages

Text Editors

(Emacs / Vim / TextMate / Coda / Sublime Text)

Form Validation

(AMM / HTML5 Validation)

CSS Selectors

( a[href$=".pdf"] ) Very loosely based on regular expressions: some attribute selectors borrow the leading caret (^=, "starts with") and the trailing dollar sign ($=, "ends with").

Great for enhancing markup based on filetypes or on destinations.
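
For comparison, here is roughly how those two selectors line up with anchored regex patterns. This is only a sketch: the link targets are made up, and the regexes approximate the selectors rather than reproduce how a browser evaluates them.

```javascript
// Hypothetical link targets, for illustration only.
const hrefs = ["/docs/report.pdf", "https://example.com/", "/img/logo.png"];

// a[href$=".pdf"]: value ends with ".pdf" (the dot is escaped in the regex)
const endsWithPdf = hrefs.filter((href) => /\.pdf$/.test(href));

// a[href^="https://"]: value starts with "https://"
const startsWithHttps = hrefs.filter((href) => /^https:\/\//.test(href));

console.log(endsWithPdf);     // [ '/docs/report.pdf' ]
console.log(startsWithHttps); // [ 'https://example.com/' ]
```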

Bash

( grep / ack / sed )

Languages

JavaScript

"string".match( /pattern/ )

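A quick sketch of what match hands back in JavaScript. The sample string and patterns are made up for illustration.

```javascript
// match returns an array-like result on success, or null on failure.
const found = "regex slidedeck".match(/slide/);
const missing = "regex slidedeck".match(/xyz/);

console.log(found[0]);    // "slide" (the matched text)
console.log(found.index); // 6 (where the match starts)
console.log(missing);     // null

// When only a yes/no answer is needed, test is the simpler call.
const hasSlide = /slide/.test("regex slidedeck");
console.log(hasSlide);    // true
```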
PHP

preg_match( "/pattern/", "string" );

Ruby

"string" =~ /pattern/

Basics

Characters

  • Literals
  • Special Characters:
    • Bell: \a
    • Escape: \e
    • Form Feed: \f
    • Vertical Tab: \v
    • Line Feed (soft break): \n
    • Carriage Return (hard break): \r
    • Tab: \t
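
A small sketch of these escapes in JavaScript, using a made-up tab-separated record. One caveat: JavaScript does not recognize \a or \e, so the example falls back to hex escapes for bell and escape.

```javascript
// Hypothetical tab-separated record ending in a Windows-style CRLF.
const record = "name\tvalue\r\n";

const fields = record.trimEnd().split(/\t/); // split on the tab character
const endsWithCrlf = /\r\n$/.test(record);   // carriage return + line feed
const hasFormFeed = /\f/.test(record);       // no form feed in this record

// JavaScript has no \a or \e; \x07 (bell) and \x1B (escape) do the same job.
const matchesBell = /\x07/.test("\x07");
const matchesEscape = /\x1B/.test("\x1B");

console.log(fields);       // [ 'name', 'value' ]
console.log(endsWithCrlf); // true
console.log(hasFormFeed);  // false
```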

Dot

  • .
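
A sketch of the dot in JavaScript: it matches any single character except a line break, unless the s (dotAll) flag is set. The test strings are made up.

```javascript
const dotAny = /a.c/.test("a-c");       // true: the dot matches "-"
const dotNewline = /a.c/.test("a\nc");  // false: no line breaks by default
const dotAll = /a.c/s.test("a\nc");     // true: the s flag lets the dot match "\n"

// To match an actual period, escape the dot.
const literalMiss = /a\.c/.test("abc"); // false
const literalHit = /a\.c/.test("a.c");  // true

console.log(dotAny, dotNewline, dotAll, literalMiss, literalHit);
```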

Character Sets

  • Characters: [abc]
  • Ranges: [0-9a-zA-Z]
  • Shorthand: \d \w \s \D \W \S (also usable inside sets, e.g. [\d\s])
  • Negate: [^a-z]
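
The four bullets above, sketched in JavaScript with made-up inputs:

```javascript
// Characters: exactly one of the listed characters.
const grey = /gr[ae]y/.test("grey");                 // true

// Ranges: the underscore is outside 0-9, a-z and A-Z.
const underscoreInRange = /[0-9a-zA-Z]/.test("_");   // false

// Shorthand: \d is a digit; \w is a word character (it includes "_").
const digits = "a1 b2".match(/\d/g);                 // [ '1', '2' ]
const underscoreIsWord = /\w/.test("_");             // true

// Shorthand inside a set: a digit OR a whitespace character.
const digitOrSpace = "a1_b 2".match(/[\d\s]/g);      // [ '1', ' ', '2' ]

// Negate: any character that is NOT a lowercase letter.
const hasNonLower = /[^a-z]/.test("abC");            // true
const allLower = /[^a-z]/.test("abc");               // false

console.log(grey, underscoreInRange, digits, underscoreIsWord);
console.log(digitOrSpace, hasNonLower, allLower);
```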

Quantifiers

  • ?
  • +
  • *
  • {n,m} {n,} {,n} {n} ({,n} is not supported in every flavor; {0,n} is the portable form)
  • *?
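
A sketch of each quantifier in JavaScript, with made-up inputs. The last two pairs are the ones that tend to bite: greedy versus lazy matching, and the fact that JavaScript treats {,n} as literal text rather than as a quantifier.

```javascript
// ? : zero or one
const optionalU = ["color", "colour", "colouur"].map((w) => /^colou?r$/.test(w));
// [ true, true, false ]

// + : one or more
const oneOrMore = "abc123def".match(/\d+/)[0];   // "123"

// * : zero or more
const zeroOrMore = /^ab*c$/.test("ac");          // true

// {n,m} : between n and m repetitions
const bounded = ["1", "12", "12345"].map((s) => /^\d{2,4}$/.test(s));
// [ false, true, false ]

// Greedy * grabs as much as it can; lazy *? grabs as little as it can.
const html = "<b>one</b> <b>two</b>";
const greedy = html.match(/<b>.*<\/b>/)[0];      // "<b>one</b> <b>two</b>"
const lazy = html.match(/<b>.*?<\/b>/)[0];       // "<b>one</b>"

// In JavaScript (without the u flag), {,2} is just the literal text "{,2}".
const braceIsLiteral = /a{,2}/.test("a{,2}");    // true
const braceNotQuantifier = /^a{,2}$/.test("aa"); // false

console.log(optionalU, oneOrMore, zeroOrMore, bounded);
console.log(greedy, lazy, braceIsLiteral, braceNotQuantifier);
```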

Alternation

  • | (pipe)
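
A sketch of alternation in JavaScript, with made-up inputs. Two things worth seeing in action: the leftmost alternative that matches wins, and the pipe has very low precedence, so anchors bind to a single branch unless the alternatives are grouped.

```javascript
const either = /cat|dog/.test("hotdog");                   // true

// Alternatives are tried left to right; the first one that matches wins.
const firstWins = "JavaScript".match(/Java|JavaScript/)[0]; // "Java"

// Without a group this reads as (^cat) OR (dog$).
const ungrouped = /^cat|dog$/.test("catalog");             // true
// With a group the anchors apply to both alternatives.
const grouped = /^(cat|dog)$/.test("catalog");             // false

console.log(either, firstWins, ungrouped, grouped);
```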

Crazy Shit

  • Grouping and Backreferences
  • Modifiers
  • Atomic Grouping and Possessive Quantifiers
  • Lookaround
  • Continuing from The Previous Match
  • Conditionals
  • Comments
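
A sketch of the items on this list that JavaScript supports, with made-up inputs. As far as I know, standard JavaScript has no atomic groups, possessive quantifiers, conditionals, or inline comments, so those are left out. The sticky y flag stands in for "continuing from the previous match", which other flavors spell \G.

```javascript
// Grouping + backreference: \1 must repeat whatever group 1 captured.
const doubled = "it was the the best".match(/\b(\w+) \1\b/);
// doubled[0] is "the the", doubled[1] is "the"

// Modifier: the i flag makes the match case-insensitive.
const caseInsensitive = /regex/i.test("RegEx");                 // true

// Lookahead: assert what follows without consuming it.
const amount = "costs 100 dollars".match(/\d+(?= dollars)/)[0]; // "100"
const qWithoutU = /q(?!u)/.test("Iraq");                        // true
const qWithU = /q(?!u)/.test("quit");                           // false

// Sticky flag: each exec must match exactly where the last one ended.
const token = /\d+,?/y;
const input = "1,2,x3";
const first = token.exec(input)[0];  // "1,"
const second = token.exec(input)[0]; // "2,"
const third = token.exec(input);     // null: "x" sits at the resume position

console.log(doubled[0], caseInsensitive, amount, qWithoutU, qWithU);
console.log(first, second, third);
```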

It's powerful

Regex of DOOM (matches all legal URIs as specified in the RFCs)

(?:http://(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)(?:/(?:(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[;:@&=]))(?:/(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[;:@&=]))))(?:\?(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;:@&=])*))?)?)|(?:ftp://(?:(?:(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[;?&=]))(?::(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;?&=])*))?@)?(?:(?:(?:(?:(?:a-zA-Z\d*[a-zA-Z\d])?).)*(?:a-zA-Z*[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?))(?:/(?:(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[?:@&=]))(?:/(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:@&=])*))*)(?:;type=[AIDaid])?)?)|(?:news:(?:(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[;/?:&=])+@(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3})))|(?:a-zA-Z)|*))|(?:nntp://(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)/(?:a-zA-Z)(?:/(?:\d+))?)|(?:telnet://(?:(?:(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[;?&=]))(?::(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[;?&=])))?@)?(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?))/?)|(?:gopher://(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)(?:/(?:[a-zA-Z\d$-_.+!'(),;/?:@&=]|(?:%[a-fA-F\d]{2}))(?:(?:(?:[a-zA-Z\d$-.+!*'(),;/?:@&=]|(?:%[a-fA-F\d]{2}))*)(?:%09(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[;:@&=]))(?:%09(?:(?:[a-zA-Z\d$-.+!*'(),;/?:@&=]|(?:%[a-fA-F\d]{2}))*))?)?)?)?)|(?:wais://(?:(?:(?:(?:(?:a-zA-Z\d*[a-zA-Z\d])?).)*(?:a-zA-Z*[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)/(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2})))(?:(?:/(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))*)/(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))))|\?(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[;:@&=])*))?)|(?:mailto:(?:(?:[a-zA-Z\
d$-.+!'(),;/?:@&=]|(?:%[a-fA-F\d]{2}))+))|(?:file://(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))|localhost)?/(?:(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:@&=])*)(?:/(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[?:@&=])))))|(?:prospero://(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)/(?:(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[?:@&=])*)(?:/(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[?:@&=]))))(?:(?:;(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[?:@&]))=(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[?:@&])))))|(?:ldap://(?:(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?))?/(?:(?:(?:(?:(?:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|oid).(?:(?:\d+)(?:.(?:\d+))))(?:(?:%0[Aa])?(?:%20))=(?:(?:%0[Aa])?(?:%20)))?(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))*))(?:(?:(?:%0[Aa])?(?:%20)*)+(?:(?:%0[Aa])?(?:%20)*)(?:(?:(?:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|oid).(?:(?:\d+)(?:.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])?(?:%20)*))?(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))))))(?:(?:(?:(?:%0[Aa])?(?:%20))(?:[;,])(?:(?:%0[Aa])?(?:%20)))(?:(?:(?:(?:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|oid).(?:(?:\d+)(?:.(?:\d+))))(?:(?:%0[Aa])?(?:%20))=(?:(?:%0[Aa])?(?:%20)))?(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))*))(?:(?:(?:%0[Aa])?(?:%20)*)+(?:(?:%0[Aa])?(?:%20)*)(?:(?:(?:(?:(?:[a-zA-Z\d]|%(?:3\d|[46][a-fA-F\d]|[57][Aa\d]))|(?:%20))+|(?:OID|oid).(?:(?:\d+)(?:.(?:\d+))*))(?:(?:%0[Aa])?(?:%20)*)=(?:(?:%0[Aa])?(?:%20)*))?(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2})))))))(?:(?:(?:%0[Aa])?(?:%20))(?:[;,])(?:(?:%0[Aa])?(?:%20)))?)(?:\?(?:(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))+)(?:,(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))+)))?)(?:\?(?:base|one|sub)(?:\?(?:((?:[a-zA-Z\d$-.+!*'(),;/?:@&=]|(
?:%[a-fA-F\d]{2}))+)))?)?)?)|(?:(?:z39.50[rs])://(?:(?:(?:(?:(?:a-zA-Z\d*[a-zA-Z\d])?).)*(?:a-zA-Z*[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)(?:/(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))+)(?:+(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))+))(?:\?(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))+))?)?(?:;esn=(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))+))?(?:;rs=(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))+)(?:+(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))+)))?))|(?:cid:(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[;?:@&=])))|(?:mid:(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[;?:@&=]))(?:/(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[;?:@&=])))?)|(?:vemmi://(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)(?:/(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[/?:@&=])*)(?:(?:;(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[/?:@&]))=(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[/?:@&])*))*))?)|(?:imap://(?:(?:(?:(?:(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[&=~])+)(?:(?:;[Aa][Uu][Tt][Hh]=(?:*|(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[&=~])+))))?)|(?:(?:;[Aa][Uu][Tt][Hh]=(?:*|(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[&=~])+)))(?:(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[&=~])+))?))@)?(?:(?:(?:(?:(?:a-zA-Z\d[a-zA-Z\d])?).)(?:a-zA-Z[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?))/(?:(?:(?:(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[&=~:@/])+)?;[Tt][Yy][Pp][Ee]=(?:Ll))|(?:(?:(?:(?:[a-zA-Z\d$-.+!'(),]|(?:%[a-fA-F\d]{2}))|[&=~:@/])+)(?:\?(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[&=~:@/])+))?(?:(?:;[Uu][Ii][Dd][Vv][Aa][Ll][Ii][Dd][Ii][Tt][Yy]=(?:[1-9]\d)))?)|(?:(?:(?:(?:[a-zA-Z\d$-_.+!'(),]|(?:%[a-fA-F\d]{2}))|[&=~:@/])+)(?:(?:;[Uu][Ii][Dd][Vv][Aa][Ll][Ii][Dd][Ii][Tt][Yy]=(?:[1-9]\d)))?(?:/;[Uu][Ii][Dd]=(?:[1-9]\d))(?:(?:/;[Ss][Ee][Cc][Tt][Ii][Oo][Nn]=(?:(?:(?:[a-zA-Z\d$-.+!*'(),]|(?:%[a-fA-F\d]{2}))|[&=~:@/])+)))?)))?)
|(?:nfs:(?:(?://(?:(?:(?:(?:(?:a-zA-Z\d*[a-zA-Z\d])?).)*(?:a-zA-Z*[a-zA-Z\d])?))|(?:(?:\d+)(?:.(?:\d+)){3}))(?::(?:\d+))?)(?:(?:/(?:(?:(?:(?:(?:[a-zA-Z\d\$-.!~'(),])|(?:%[a-fA-F\d]{2})|[:@&=+]))(?:/(?:(?:(?:[a-zA-Z\d\$-.!~*'(),])|(?:%[a-fA-F\d]{2})|[:@&=+])*))*)?)))?)|(?:/(?:(?:(?:(?:(?:[a-zA-Z\d\$-.!~'(),])|(?:%[a-fA-F\d]{2})|[:@&=+]))(?:/(?:(?:(?:[a-zA-Z\d\$-.!~*'(),])|(?:%[a-fA-F\d]{2})|[:@&=+])*))*)?))|(?:(?:(?:(?:(?:[a-zA-Z\d\$-.!~'(),])|(?:%[a-fA-F\d]{2})|[:@&=+]))(?:/(?:(?:(?:[a-zA-Z\d\$-_.!~'(),])|(?:%[a-fA-F\d]{2})|[:@&=+])))*)?)))
                    

Usage

Replacement
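
A minimal sketch of replacement in JavaScript, with made-up inputs: captured groups are available as $1, $2, and so on in the replacement string, the g flag replaces every match instead of only the first, and a function can compute each replacement.

```javascript
// Reorder captured groups: year-month-day becomes day/month/year.
const swapped = "2024-01-31".replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1");
// "31/01/2024"

// Without g only the first match is replaced; with g, all of them.
const firstOnly = "a-b-c".replace(/-/, "+");  // "a+b-c"
const every = "a-b-c".replace(/-/g, "+");     // "a+b+c"

// A function replacement receives each match and returns its substitute.
const titled = "hello world".replace(/\b\w/g, (ch) => ch.toUpperCase());
// "Hello World"

console.log(swapped, firstOnly, every, titled);
```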

Advice

Resources

  • Regular Expressions.info
  • Rubular Regular Expression Editor