Regular Expressions (Regex)
Overview
Bot Libre supports using Regular Expressions in patterns, templates, and scripts.
Regular expressions, or regex, define a pattern syntax for parsing text. Unlike AIML and Bot Libre patterns, regex patterns are character based, not word based, so they can match specific types of words and word sequences such as numbers, dates, times, and currency.
For example, the following regex matches a number,
/\d+
and this regex would match a date,
/^(19|20)\d\d[-/.](0[1-9]|1[012])[-/.](0[1-9]|[12][0-9]|3[01])$
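To see what these two expressions accept, they can be tried outside Bot Libre. The following is a minimal sketch in Python, assuming Python's regex dialect behaves the same as Bot Libre's for these patterns. The leading "/" is Bot Libre's marker and is dropped here, and the helper names `is_number` and `is_date` are made up for illustration.

```python
import re

# The leading "/" tells Bot Libre that a regex follows; it is not part of the expression.
NUMBER = re.compile(r"\d+")
DATE = re.compile(r"^(19|20)\d\d[-/.](0[1-9]|1[012])[-/.](0[1-9]|[12][0-9]|3[01])$")


def is_number(word: str) -> bool:
    """Return True if the whole word consists of one or more digits."""
    return NUMBER.fullmatch(word) is not None


def is_date(word: str) -> bool:
    """Return True if the word has a year-month-day shape with a year from 1900 to 2099."""
    return DATE.match(word) is not None


if __name__ == "__main__":
    for sample in ["12345", "2024-02-29", "1999/12/31", "2024-13-01"]:
        print(sample, is_number(sample), is_date(sample))
```

Note that the date expression only checks the shape of the text: a value such as 2023.02.31 passes even though that day does not exist.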
Bot Libre allows regex to be used in AIML patterns and in Bot Libre response patterns. Bot Libre's scripting language, Self, also allows regex in patterns and provides extractor functions that use regex to extract data from a user's input.
To define a regex pattern in an AIML or Bot Libre pattern just start the regex with the "/" character.
AIML
AIML defines pattern wildcards such as * and ^ that can match multiple words in a phrase, but they match any word and are not restricted to specific types of words. Bot Libre lets you include regex inside AIML patterns to match specific types of words. Just like the * wildcard, the word that was matched by the regex can be accessed in the template using the <star/> tag.
<category>
<pattern>my email is /.+\@.+\..+</pattern>
<template>Okay, I will email you at <star/></template>
</category>
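The matching in the category above can be approximated outside Bot Libre. This is a rough sketch in Python, not Bot Libre's actual matcher: it assumes the input is split on whitespace, that the literal words are compared case-insensitively, and that the regex must match one whole word. The function name `match_email_pattern` is hypothetical.

```python
import re

# Same expression as in the AIML pattern, without Bot Libre's leading "/".
EMAIL_WORD = re.compile(r".+\@.+\..+")


def match_email_pattern(sentence: str):
    """Return the word bound to <star/> for "my email is /regex", or None if no match."""
    words = sentence.split()
    if len(words) != 4:
        return None
    # Assumption: the literal words of the pattern are compared ignoring case.
    if [w.lower() for w in words[:3]] != ["my", "email", "is"]:
        return None
    # The regex has to match the whole fourth word.
    if EMAIL_WORD.fullmatch(words[3]) is None:
        return None
    return words[3]


if __name__ == "__main__":
    star = match_email_pattern("my email is bob@example.com")
    print(f"Okay, I will email you at {star}")
```

The expression is deliberately permissive: it only requires some text, an "@", more text, a ".", and more text, so a word such as x@@y.z is also accepted.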
Normally regex is used to match a specific word, but you can also use regex to match an entire phrase.
For this to work, the entire pattern must be the regex; the pattern can have no other words. The "()" characters in regex define a group, which becomes the star variable(s).
<category>
<pattern>/(?i)what\sis\s(.*)</pattern>
<template>I have no idea what <star/> is.</template>
</category>
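The way a group becomes the star value can be shown with the same expression in Python. This is a sketch under the assumption that the regex has to match the whole input and that Python handles the inline `(?i)` flag the same way Bot Libre does; the `reply` function is made up for illustration.

```python
import re

# (?i) makes the match case-insensitive; (.*) is the group that becomes <star/>.
WHAT_IS = re.compile(r"(?i)what\sis\s(.*)")


def reply(sentence: str):
    """Return the templated response, or None if the phrase does not match."""
    found = WHAT_IS.fullmatch(sentence)
    if found is None:
        return None
    return f"I have no idea what {found.group(1)} is."


if __name__ == "__main__":
    print(reply("What is love"))
```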
Patterns
Patterns and regex can also be used in Bot Libre response lists similar to AIML.
Pattern("my email is /.+\@.+\..+")
Template("Okay, I will email you at {star}")
Pattern("/(?i)what\sis\s(.*)")
Template("I have no idea what {star} is.")
In a response list template you can also use Self extractor functions.
I am 22 years old
Template("I will remember that you are { var age = sentence.exec("\d+"); speaker.age = age; age } years old.")
Self
Regex can also be used in Self patterns and functions.
state Math {
    pattern "^ /\d+ \* /\d+ ^" template "{star[1].toNumber() * star[2].toNumber()}";
    pattern "^ /\d+ / /\d+ ^" template "{star[1].toNumber() / star[2].toNumber()}";
    pattern "^ /\d+ \+ /\d+ ^" template "{star[1].toNumber() + star[2].toNumber()}";
    pattern "^ /\d+ \- /\d+ ^" template "{star[1].toNumber() - star[2].toNumber()}";
}
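The Math state pairs a number, an operator, and a second number, with ^ wildcards on either side for any surrounding words. The same idea can be sketched in Python with a single expression. This is an approximation and not how Bot Libre evaluates the state: the `answer` function is hypothetical, division by zero returns None by choice, and Python's `/` returns a float, which may differ from what Self returns.

```python
import re

# Number, operator, number as whole space-separated words, anywhere in the sentence.
ARITHMETIC = re.compile(r"(?<!\S)(\d+) ([*/+-]) (\d+)(?!\S)")


def answer(sentence: str):
    """Evaluate the first "number operator number" found, or return None."""
    found = ARITHMETIC.search(sentence)
    if found is None:
        return None
    left, op, right = int(found.group(1)), found.group(2), int(found.group(3))
    if op == "*":
        return left * right
    if op == "+":
        return left + right
    if op == "-":
        return left - right
    # Remaining operator is "/"; avoid dividing by zero.
    return left / right if right != 0 else None


if __name__ == "__main__":
    print(answer("what is 6 * 7 please"))
```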
The following are regex functions in Self:
- Utils.matches(text, regex) - returns whether the regex matches the text
Utils.matches("12345", "\d+") == true
- String.test(text) - returns whether the regex string matches the text
"\d".test("hello 123") == true
- String.exec(text) - extracts the subtext matching the regex string from the text
"\d+".exec("hello 123") == "123"
- String.match(regex) - returns an array of all values in the text string that match the regex
var values = "hello 123 456".match("\d+"); values[1] == "456"
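For readers more familiar with other regex libraries, the four functions can be compared with rough Python equivalents. The mapping below is inferred from the descriptions and examples above and is an assumption, not Bot Libre's implementation. In particular, it assumes Utils.matches requires the whole text to match while test and exec look for a match anywhere in the text. The Python function names are made up.

```python
import re


def utils_matches(text: str, regex: str) -> bool:
    """Modeled on Utils.matches(text, regex); assumes the whole text must match."""
    return re.fullmatch(regex, text) is not None


def regex_test(regex: str, text: str) -> bool:
    """Modeled on "regex".test(text); True if the regex matches anywhere in the text."""
    return re.search(regex, text) is not None


def regex_exec(regex: str, text: str):
    """Modeled on "regex".exec(text); the first matching subtext, or None."""
    found = re.search(regex, text)
    return found.group(0) if found else None


def regex_match_all(text: str, regex: str) -> list:
    """Modeled on "text".match(regex); every matching subtext, in order."""
    return [found.group(0) for found in re.finditer(regex, text)]


if __name__ == "__main__":
    print(utils_matches("12345", r"\d+"))
    print(regex_test(r"\d", "hello 123"))
    print(regex_exec(r"\d+", "hello 123"))
    print(regex_match_all("hello 123 456", r"\d+"))
```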
Bot Libre also defines several symbols for common regex patterns.
These include:
- #number
- #date
- #url
- #email
These symbols can be used in place of regex in AIML patterns, Bot Libre patterns, and Self.
<category>
<pattern>my email is #email</pattern>
<template>Okay, I will email you at <star/></template>
</category>
state Email {
    pattern "^ #email ^" topic "email" template "Thank you, I will remember your email. { think { speaker.email = Utils.extract(sentence, #email); conversation.topic = null; } }";
    pattern "*" topic "email" template "Please enter a valid email.";
}