Non-trivial regular expressions use certain special constructs so that they can match more than one string. For example, the regular expression canine|feline matches either the string canine or the string feline.
Data Major uses these regular expressions in most places where you can enter details. The full list is given below, although only some entries have examples.
| Regular Expressions | |||
| ^ | Match the beginning of a string. | ^Canine | The text STARTS with the word canine |
| $ | Match the end of a string. | Terrier$ | The text ENDS with the word terrier |
| de\|abc | Match either of the sequences de or abc. | Canine\|Feline | Matches EITHER the text Canine OR Feline |
| . | Match any character | ||
| a* | Match any sequence of zero or more a characters. | ||
| a+ | Match any sequence of one or more a characters. | ||
| a? | Match either zero or one a character. | ||
| (abc)* | Match zero or more instances of the sequence abc. | ||
| [a-dX] / [^a-dX] | Matches any character that is (or is not, if ^ is used) either a, b, c, d or X. A - character between two other characters forms a range that matches all characters from the first character to the second. For example, [0-9] matches any decimal digit. | ||
| To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. For example, to match the string 1+2, which contains the special + character, use 1\\+2. | |||
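
The constructs in the table above can be tried out in most regular expression engines. The following is a minimal sketch using Python's `re` module, which is an assumption made purely for illustration: it is not the engine Data Major uses, and the sample strings and the `matching` helper are invented. Note that a Python raw string needs only one backslash to escape a special character, whereas the note above calls for two when entering a pattern in Data Major.

```python
import re

# Hypothetical sample values, invented for illustration.
samples = ["Canine Terrier", "Feline Siamese", "Border Terrier", "Equine Cob"]

def matching(pattern, values):
    """Return the values in which the pattern is found anywhere."""
    return [v for v in values if re.search(pattern, v)]

starts_with_canine = matching(r"^Canine", samples)      # ^ anchors to the start
ends_with_terrier = matching(r"Terrier$", samples)      # $ anchors to the end
canine_or_feline = matching(r"Canine|Feline", samples)  # | means either/or

# [0-9] matches any decimal digit; + means one or more of the preceding item.
has_digits = bool(re.search(r"[0-9]+", "Weight 12kg"))

# A literal + must be escaped so it is not read as "one or more".
literal_plus = bool(re.search(r"1\+2", "1+2"))

print(starts_with_canine)
print(ends_with_terrier)
print(canine_or_feline)
print(has_digits, literal_plus)
```
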
| Ad-Hoc - Full text Searching | |||||||||||||
|
The full text search can handle 'Boolean Searches'. This just means you have much more control over what is searched for than with the basic 'Clinical Text Contain' option.
The basic Full Text Search looks for ANY occurrence of the words you enter and returns a 'score' based on what it finds. For example, searching for 'Synulox tabs' looks for lines where either word is present. The higher the 'score', the better the match. The other option is 'Boolean' mode: the system switches into this mode automatically if any of the following characters are entered in the search criteria: ~ ( ) < > - + * "
|
|||||||||||||
| The other Boolean options are listed here for reference; no examples are given. | |||||||||||||
| () | Parentheses are used to group words into sub expressions. Parenthesized groups can be nested. | ||||||||||||
| ~ | A leading tilde acts as a negation operator, causing the word's contribution to the row relevance to be negative. It's useful for marking noise words. A row that contains such a word will be rated lower than others, but will not be excluded altogether, as it would be with the - operator. |
||||||||||||
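
The automatic switch into Boolean mode described above can be pictured as a simple check on the search text. This is only a sketch under stated assumptions: the function name `uses_boolean_mode` and the constant are hypothetical, and Data Major's actual implementation is not documented here.

```python
# Characters that, per the text above, switch the search into Boolean mode.
BOOLEAN_TRIGGER_CHARS = set('~()<>-+*"')

def uses_boolean_mode(search_text):
    """Return True if the search text contains any Boolean trigger character."""
    return any(ch in BOOLEAN_TRIGGER_CHARS for ch in search_text)

print(uses_boolean_mode("Synulox tabs"))    # False: plain words only
print(uses_boolean_mode("+Synulox ~tabs"))  # True: contains + and ~
```

Under this rule a hyphenated term would also count as a Boolean search, because it contains the - character.
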
| Operator Type | Examples | Description |
|---|---|---|
| Literal characters (match a character exactly) | a A y 6 % @ | Letters, digits and many special characters match exactly |
| | \$ \^ \+ \\ \? | Precede other special characters with a \ to cancel their regex special meaning |
| | \n \t \r | Literal new line, tab, return |
| | \cJ \cG | Control codes |
| | \xa3 | Hex codes for any character |
| Anchors and assertions | ^ | Field starts with |
| | $ | Field ends with |
| | [[:<:]] | Word starts with |
| | [[:>:]] | Word ends with |
| Character groups (any 1 character from the group) | [aAeEiou] | any character listed from [ to ] |
| | [^aAeEiou] | any character except aAeEio or u |
| | [a-fA-F0-9] | any hex character (0 to 9 or a to f) |
| | . | any character at all |
| | [[:space:]] | any space character (space \n \r or \t) |
| | [[:alnum:]] | any alphanumeric character (letter or digit) |
| Counts (apply to previous element) | + | 1 or more ("some") |
| | * | 0 or more ("perhaps some") |
| | ? | 0 or 1 ("perhaps a") |
| | {4} | exactly 4 |
| | {4,} | 4 or more |
| | {4,8} | between 4 and 8 |
| | | Add a ? after any count to turn it sparse (match as few as possible) rather than have it default to greedy |
| Alternation | \| | either, or |
| Grouping | ( ) | group for count and save to variable |
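
Counts, greedy versus sparse matching, and grouping are easiest to see side by side. The sketch below again assumes Python's `re` module as a stand-in and uses invented sample text; it is not Data Major's engine. Python does not support the `[[:<:]]`, `[[:>:]]`, `[[:space:]]` or `[[:alnum:]]` forms from the table, so the example uses `\b` as a rough equivalent for the word anchors.

```python
import re

# Invented sample text for illustration.
text = "ref <a1b2> and <ffee>"

# Greedy: .+ takes as much as it can, so the match runs to the last >.
greedy = re.search(r"<.+>", text).group()

# Sparse: adding ? after the count stops at the first > instead.
sparse = re.search(r"<.+?>", text).group()

# {4} means exactly four of the preceding element (here, a hex character).
hex_runs = re.findall(r"[a-fA-F0-9]{4}", text)

# Parentheses group alternatives and save what each group matched.
m = re.search(r"(Canine|Feline) ([0-9]{1,2})", "Feline 12")
species, age = m.group(1), m.group(2)

# \b marks a word boundary, so only words that start with "cat" are counted.
word_starts = re.findall(r"\bcat", "cat concat catalog")

print(greedy)
print(sparse)
print(hex_runs)
print(species, age)
print(len(word_starts))
```
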