wreck.api

The public API of wreck.

Notes:

  • Apart from passing nil through, this library does minimal argument checking: the rules for regular expressions vary from platform to platform, and it is a first-class requirement that callers be able to construct platform-specific regular expressions if they wish.
  • As a result, all functions have the potential to throw platform-specific exceptions if the resulting regular expression is syntactically invalid.
  • On the JVM, these will typically be instances of the java.util.regex.PatternSyntaxException class.
  • On JavaScript, these will typically be a js/SyntaxError.
  • Platform-specific behaviour is particularly notable for short / empty regular expressions, such as #"{}" (an error on the JVM, fine but nonsensical on JS) and #"{1}" (ironically, fine but nonsensical on the JVM, but an error on JS). 🤡
  • Furthermore, JavaScript performs automatic escaping of the ‘/’ character when a RegExp object is constructed, and (to my knowledge) there is no way to get the original source string back out. This is a problem as wreck is fundamentally dependent on full fidelity regex <-> string round-tripping in order to function, and JavaScript does not appear to support that.
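As a rough illustration of the exception behaviour described in the notes above, the following JVM-only sketch wraps regex construction in a try/catch. The helper name compile-or-error is hypothetical and is not part of wreck.

```clojure
;; JVM-only sketch: shows the exception type raised by an invalid pattern.
(defn compile-or-error
  "Hypothetical helper (not part of wreck): returns the compiled regex,
  or a map describing the syntax error if s is not a valid pattern."
  [s]
  (try
    (re-pattern s)
    (catch java.util.regex.PatternSyntaxException e
      {:invalid s :description (.getDescription e)})))

(compile-or-error "{}")    ; a map describing the error, on the JVM
(compile-or-error "a{1}")  ; a compiled regex
```

Callers that build regexes from untrusted or platform-specific fragments may want a similar guard around calls into this namespace.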

='

(=' _)
(=' re1 re2)
(=' re1 re2 & more)

Equality for regexes, defined by having equal String representations. This means that equivalent regexes (e.g. #"..." and #".{3}") will not be considered equal.

Notes:

  • Some JavaScript runtimes that ClojureScript runs on correctly implement equality for regexes, but the JVM does not.
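The gap this function fills can be seen with clojure.core alone. This is a small JVM sketch of the underlying behaviour, not wreck's implementation:

```clojure
;; On the JVM, java.util.regex.Pattern does not override equals, so two
;; separately compiled regexes are never = to each other.
(def re1 (re-pattern "a+"))
(def re2 (re-pattern "a+"))

(= re1 re2)              ; false on the JVM
(= (str re1) (str re2))  ; true: the String representations are equal

;; Equivalent regexes with different source text still compare unequal.
(= (str #"...") (str #".{3}"))  ; false
```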

alt

(alt & res)

Returns a regex that will match any one of res, via alternation.

Notes:

  • Duplicate elements in res will only appear once in the result.
  • Does not wrap the result in a group, which, because alternation has the lowest precedence in regexes, runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should almost always be preferred.
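The precedence risk can be demonstrated with hand-written patterns. The sketch below uses only clojure.core and assumes nothing about how wreck builds its result:

```clojure
;; Ungrouped alternation followed by a suffix: the suffix binds only to
;; the last alternative.
(def ungrouped (re-pattern (str "cat|dog" "s")))      ; cat|dogs
(def grouped   (re-pattern (str "(?:cat|dog)" "s")))  ; (?:cat|dog)s

(re-matches ungrouped "cats")  ; nil
(re-matches ungrouped "cat")   ; "cat"
(re-matches grouped "cats")    ; "cats"
```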

alt-cg

(alt-cg & res)

alt then cg.

alt-grp

(alt-grp & res)

alt then grp.

alt-ncg

(alt-ncg nm & res)

alt then ncg.

and'

(and' a b)
(and' a b s)

Returns an ‘and’ regex that will match a and b in any order, and with the separator regex (if provided) between them. This is implemented as ASB|BSA, which means that A and B must be distinct (must not match the same text).

Notes:

  • May optimise the expression (via de-duplication in alt).
  • Does not wrap the result in a group, which, because alternation has the lowest precedence in regexes, runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should almost always be preferred.
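A minimal sketch of the ASB|BSA expansion, written against plain Strings. The function and-sketch is a hypothetical stand-in and makes no claim about wreck's actual code (it performs no de-duplication, for example):

```clojure
(defn and-sketch
  "Hypothetical stand-in for the ASB|BSA expansion (not wreck's code).
  a, b and s are regex source Strings; s defaults to the empty String."
  ([a b] (and-sketch a b ""))
  ([a b s] (re-pattern (str a s b "|" b s a))))

(def foo-and-bar (and-sketch "foo" "bar" "-"))  ; foo-bar|bar-foo

(re-matches foo-and-bar "foo-bar")  ; "foo-bar"
(re-matches foo-and-bar "bar-foo")  ; "bar-foo"
(re-matches foo-and-bar "foo-foo")  ; nil
```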

and-cg

(and-cg a b)
(and-cg a b s)

and' then cg.

Notes:

  • Unlike most other -cg fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

and-grp

(and-grp a b)
(and-grp a b s)

and' then grp.

Notes:

  • Unlike most other -grp fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

and-ncg

(and-ncg nm a b)
(and-ncg nm a b s)

and' then ncg.

Notes:

  • Unlike most other -ncg fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

cg

(cg & res)

As for grp, but uses a capturing group.
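The practical difference between a capturing and a non-capturing group shows up in what clojure.core's matching functions return. The patterns below are hand-written for illustration:

```clojure
;; Non-capturing group: the match is returned as a plain String.
(re-matches #"(?:ab)c" "abc")  ; "abc"

;; Capturing group: the match is a vector of the whole match followed by
;; each captured group.
(re-matches #"(ab)c" "abc")    ; ["abc" "ab"]
```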

empty?'

(empty?' re)

Is re nil or (=' #"")?

esc

(esc s)

Escapes s (a String) for use in a regex, returning a String. Note that unlike most other fns in this namespace, this one does not support a regex as an input, nor return a regex as an output.
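To show the general idea of String-to-String escaping, here is a JVM-only sketch that backslash-escapes common regex metacharacters. The function esc-sketch and its choice of metacharacters are assumptions made for illustration; esc itself may escape differently.

```clojure
(require '[clojure.string :as str])

(defn esc-sketch
  "Hypothetical illustration (not wreck's esc): prefixes each regex
  metacharacter in s with a backslash and returns a String."
  [s]
  ;; In the replacement, \\\\ yields one literal backslash and $0 is the
  ;; matched metacharacter (JVM replacement syntax).
  (str/replace s #"[.*+?^${}()|\[\]\\]" "\\\\$0"))

(esc-sketch "1.5")                             ; "1\\.5"
(re-matches (re-pattern (esc-sketch "1.5")) "1.5")  ; "1.5"
(re-matches (re-pattern (esc-sketch "1.5")) "1x5")  ; nil
```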

exn

(exn n re)

Returns a regex where re will match exactly n times.

exn-cg

(exn-cg n & res)

cg then exn.

exn-grp

(exn-grp n & res)

grp then exn.
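Grouping before quantifying matters whenever the quantified expression is longer than one character, because a quantifier binds only to the element immediately before it. The hand-written patterns below illustrate this; they are not output captured from wreck.

```clojure
;; Without a group, {2} applies to "b" alone.
(re-matches #"ab{2}" "abb")       ; "abb"
(re-matches #"ab{2}" "abab")      ; nil

;; With a non-capturing group, {2} applies to "ab" as a unit.
(re-matches #"(?:ab){2}" "abab")  ; "abab"
```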

exn-ncg

(exn-ncg nm n & res)

ncg then exn.

grp

(grp & res)

As for join, but encloses the joined res into a single non-capturing group.

join

(join & res)

Returns a regex that is all of the res joined together. Each element in res can be a regex, a String or something that can be turned into a String (including numbers, etc.). Returns nil when no res are provided, or they’re all nil.
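The described behaviour can be approximated on the JVM in a few lines. The function join-sketch is a hypothetical stand-in, and it relies on the JVM's str returning a regex's source text unchanged (which is not the case on ClojureScript):

```clojure
(defn join-sketch
  "Hypothetical approximation of join (not wreck's code): concatenates
  the String forms of the non-nil res and compiles the result."
  [& res]
  (let [parts (remove nil? res)]
    (when (seq parts)
      (re-pattern (apply str parts)))))

(str (join-sketch #"\d+" "-" 42))            ; "\\d+-42"
(re-matches (join-sketch #"\d+" "-" 42) "7-42")  ; "7-42"
(join-sketch nil nil)                        ; nil
(join-sketch)                                ; nil
```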

n2m

(n2m n m re)

Returns a regex where re will match from n to m times.
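In regex syntax a bounded repetition is written {n,m}. The sketch below builds such a pattern by hand with clojure.core to show the matching behaviour; it does not call n2m.

```clojure
(def two-to-three (re-pattern (str "a" "{" 2 "," 3 "}")))  ; a{2,3}

(re-matches two-to-three "a")     ; nil
(re-matches two-to-three "aa")    ; "aa"
(re-matches two-to-three "aaa")   ; "aaa"
(re-matches two-to-three "aaaa")  ; nil
```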

n2m-cg

(n2m-cg n m & res)

cg then n2m.

n2m-grp

(n2m-grp n m & res)

grp then n2m.

n2m-ncg

(n2m-ncg nm n m & res)

ncg then n2m.

ncg

(ncg nm & res)

As for grp, but uses a named capturing group named nm. Returns nil if nm is nil or blank. Throws if nm is an invalid name for a named capturing group (alphanumeric only, must start with an alphabetical character, must be unique within the regex).
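For readers unfamiliar with named capturing groups, the JVM-only sketch below shows a hand-written pattern and how group names are read back via java.util.regex.Matcher. The helper names are hypothetical, and the naming rules shown are those of the JVM regex engine.

```clojure
(def date-re #"(?<year>\d{4})-(?<month>\d{2})")

(defn year-and-month
  "Hypothetical helper: returns [year month] if s matches date-re."
  [s]
  (let [m (re-matcher date-re s)]
    (when (.matches m)
      [(.group m "year") (.group m "month")])))

(defn valid-pattern?
  "Hypothetical helper: true if s compiles as a regex on the JVM."
  [s]
  (try
    (re-pattern s)
    true
    (catch java.util.regex.PatternSyntaxException _ false)))

(year-and-month "2024-05")         ; ["2024" "05"]
(valid-pattern? "(?<1st>a)")       ; false: name starts with a digit
(valid-pattern? "(?<x>a)(?<x>b)")  ; false: duplicate group name
```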

nom

(nom n re)

Returns a regex where re will match n or more times.

nom-cg

(nom-cg n & res)

cg then nom.

nom-grp

(nom-grp n & res)

grp then nom.

nom-ncg

(nom-ncg nm n & res)

ncg then nom.

oom

(oom re)

Returns a regex where re will match one or more times.

oom-cg

(oom-cg & res)

cg then oom.

oom-grp

(oom-grp & res)

grp then oom.

oom-ncg

(oom-ncg nm & res)

ncg then oom.

opt

(opt re)

Returns a regex where re is optional.
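Optionality corresponds to the ? quantifier. As with other quantifiers it binds to the preceding element only, which is where the grouping variants become useful. The patterns below are hand-written for illustration:

```clojure
(re-matches #"colou?r" "color")         ; "color"
(re-matches #"colou?r" "colour")        ; "colour"

;; ? applies to "n" alone here, so the leading "u" is still required.
(re-matches #"un?happy" "happy")        ; nil

;; Grouped, the whole prefix becomes optional.
(re-matches #"(?:un)?happy" "happy")    ; "happy"
(re-matches #"(?:un)?happy" "unhappy")  ; "unhappy"
```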

opt-cg

(opt-cg & res)

cg then opt.

opt-grp

(opt-grp & res)

grp then opt.

opt-ncg

(opt-ncg nm & res)

ncg then opt.

or'

(or' a b)
(or' a b s)

Returns an ‘inclusive or’ regex that will match a or b, or both, in any order, and with the separator regex (if provided) between them. This is implemented as ASB|BSA|A|B, which means that A and B must be distinct (must not match the same text).

Notes:

  • May optimise the expression (via de-duplication in alt).
  • Does not wrap the result in a group, which, because alternation has the lowest precedence in regexes, runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should almost always be preferred.
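A minimal sketch of the ASB|BSA|A|B expansion, again over plain Strings. The function or-sketch is a hypothetical stand-in rather than wreck's code; note that the longer alternatives come first, so they are tried before the single-operand ones.

```clojure
(defn or-sketch
  "Hypothetical stand-in for the ASB|BSA|A|B expansion (not wreck's code).
  a, b and s are regex source Strings; s defaults to the empty String."
  ([a b] (or-sketch a b ""))
  ([a b s] (re-pattern (str a s b "|" b s a "|" a "|" b))))

(def foo-or-bar (or-sketch "foo" "bar" "-"))  ; foo-bar|bar-foo|foo|bar

(re-matches foo-or-bar "foo")      ; "foo"
(re-matches foo-or-bar "bar-foo")  ; "bar-foo"
(re-find foo-or-bar "foo-bar")     ; "foo-bar" (not just "foo")
(re-matches foo-or-bar "baz")      ; nil
```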

or-cg

(or-cg a b)
(or-cg a b s)

or' then cg.

Notes:

  • Unlike most other -cg fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

or-grp

(or-grp a b)
(or-grp a b s)

or' then grp.

Notes:

  • Unlike most other -grp fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

or-ncg

(or-ncg nm a b)
(or-ncg nm a b s)

or' then ncg.

Notes:

  • Unlike most other -ncg fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

qot

(qot s)

Quotes s (a String) for use in a regex, returning a regex. Note that unlike most other fns in this namespace, this one does not support a regex as an input.

str'

(str' o)

Returns the String representation of o, with special handling for RegExp objects on ClojureScript in an attempt to correct JavaScript’s APPALLING default stringification.
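The property that str' tries to preserve is regex <-> String round-tripping. On the JVM, clojure.core's str already provides it, as this small sketch shows; the ClojureScript side is not exercised here.

```clojure
(def src "a/b\\d+")  ; the regex source a/b\d+

(str (re-pattern src))          ; "a/b\\d+"
(= src (str (re-pattern src)))  ; true on the JVM: the source survives intact
```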

xor'

(xor' a b)

Returns an ‘exclusive or’ regex that will match a or b, but not both. This is identical to alt called with 2 arguments, and is provided as a convenience for those who might be building up large logic-based regexes and would prefer to use more easily understood logical operator names throughout.

Notes:

  • May optimise the expression (via de-duplication in alt).
  • Does not wrap the result in a group, which, because alternation has the lowest precedence in regexes, runs the risk of behaving unexpectedly if the result is then combined with further regexes. tl;dr - one of the grouping variants should almost always be preferred.

xor-cg

(xor-cg a b)

xor' then cg.

Notes:

  • Unlike most other -cg fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

xor-grp

(xor-grp a b)

xor' then grp.

Notes:

  • Unlike most other -grp fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

xor-ncg

(xor-ncg nm a b)

xor' then ncg.

Notes:

  • Unlike most other -ncg fns, this one does not accept any number of res.
  • May optimise the expression (via de-duplication in alt).

zom

(zom re)

Returns a regex where re will match zero or more times.
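"Zero or more" corresponds to the * quantifier, which (unlike +) also accepts the empty String. A short comparison using hand-written patterns:

```clojure
(re-matches #"a*" "")     ; ""
(re-matches #"a+" "")     ; nil
(re-matches #"a*" "aaa")  ; "aaa"
```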

zom-cg

(zom-cg & res)

cg then zom.

zom-grp

(zom-grp & res)

grp then zom.

zom-ncg

(zom-ncg nm & res)

ncg then zom.