Pattern Matching
Overview
xTuple ERP supports the use of Regular Expressions (also known as "Regex") in fields where pattern matching is called for. This Regular Expression support gives you tremendous flexibility and control whenever you want to retrieve unique patterns from within any of the following groupings:
- Class Codes
- Planner Codes
- Item Groups
- Product Categories
- Customer Types
- Customer Groups
- Vendor Types
Let's say, for example, you want to generate a report showing Internet sales during a given period. To access this data, you would need to look at sales made to your Internet Customers. Because Internet Customers are a subset of Customer Types, you would create a Regular Expression to retrieve this subset and send it to the report.
Characters and Metacharacters
Regular Expressions are created using different combinations of characters and metacharacters. A character is any letter (upper or lower case) or digit, as well as punctuation marks, white space, and other keyboard symbols. Metacharacters, sometimes referred to as "wildcards," are special characters used to facilitate pattern matching. The most common metacharacters are described in the tables below.
The more you understand the role of metacharacters, the more you will be able to control your pattern matching.
The first thing to understand is that Regular Expressions match "substrings." A substring is a subset of a "string," a string being a sequence of characters arranged in a line. For example, the digits 123 are a substring of the string 012345. Similarly, the letters EAT are a substring of the strings MEAT, EATERY, and THEATER. Numbers and letters can also be combined to form a substring. The pattern INTCUST4 is a substring of a particular set of Internet Customer strings: INTCUST400, INTCUST401, INTCUST402, and so on. As you can see, substrings may appear anywhere in a string: at the beginning, the middle, or the end.
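The substring behavior described above can be sketched in a few lines of Python. This is only an illustration: Python's `re` module stands in for the xTuple ERP matching engine, and the sample lists (including the non-matching values) are hypothetical.

```python
import re

# Hypothetical sample values. re.search succeeds when the pattern
# occurs anywhere in the string, which mirrors substring matching.
words = ["MEAT", "EATERY", "THEATER", "TEA"]
eat_matches = [w for w in words if re.search("EAT", w)]

customer_types = ["INTCUST400", "INTCUST401", "INTCUST500", "RETCUST400"]
intcust4_matches = [c for c in customer_types if re.search("INTCUST4", c)]

print(eat_matches)       # ['MEAT', 'EATERY', 'THEATER']
print(intcust4_matches)  # ['INTCUST400', 'INTCUST401']
```

The pattern EAT is found at the end of MEAT, the start of EATERY, and the middle of THEATER, so all three are returned.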
Regular Expressions are case-sensitive. This means there is a difference between "a" and "A". Keep this in mind when using Regular Expressions for pattern matching.
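As a minimal sketch of case sensitivity, the Python snippet below applies one upper-case pattern to three hypothetical codes that differ only in letter case. Python's `re` module is used for illustration and is also case-sensitive by default.

```python
import re

# Hypothetical Customer Type codes that differ only in letter case.
codes = ["INTCUST400", "intcust400", "IntCust400"]

# The upper-case pattern matches only the upper-case code.
upper_only = [c for c in codes if re.search("INTCUST", c)]

print(upper_only)  # ['INTCUST400']
```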
Metacharacters give you even more control over your substring definitions. With metacharacters, you can specify the exact location of a substring: beginning of a word or line, end of a word or line, etc. You can also specify ranges of data: Customers A-Z, Items 1-9, etc. This sort of control is especially vital when searching through large quantities of data. As you can imagine, metacharacters will not only save you time, they will also increase your precision. The following tables describe metacharacters in more detail.
Single Character Metacharacters
| Metacharacter | Description |
| --- | --- |
| . | Matches any single character. |
| [...] | Matches any single character listed between the brackets. |
| [^...] | Matches any single character except those listed between the brackets. |
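The single character metacharacters can be tried out with a short Python sketch. The codes below are hypothetical, and Python's `re` module is assumed to behave like the xTuple ERP engine for these basic constructs.

```python
import re

# Hypothetical codes used only to illustrate the metacharacters.
codes = ["CUST1", "CUSTA", "CUSTB", "CUST", "CUSTX"]

def select(pattern):
    """Return the codes in which the pattern is found."""
    return [c for c in codes if re.search(pattern, c)]

any_char = select("CUST.")       # one character of any kind after CUST
listed = select("CUST[AB]")      # A or B after CUST
not_listed = select("CUST[^AB]") # any character except A or B after CUST
digit = select("CUST[0-9]")      # a range inside brackets: any digit

print(any_char)    # ['CUST1', 'CUSTA', 'CUSTB', 'CUSTX']
print(listed)      # ['CUSTA', 'CUSTB']
print(not_listed)  # ['CUST1', 'CUSTX']
print(digit)       # ['CUST1']
```

Note that the bare code CUST is never returned: each of these patterns requires one more character after CUST.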
xTuple ERP supports pattern matching with Regular Expressions in accordance with the Portable Operating System Interface (POSIX) standard.
Quantifiers
| Metacharacter | Description |
| --- | --- |
| ? | Matches the preceding element zero or one time. |
| * | Matches the preceding element zero or more times. |
| + | Matches the preceding element one or more times. |
| \| | Operates as a choice between alternatives, equivalent to "or". Example: the expression abc\|def would match "abc" or "def". |
| {num} | Matches the preceding element num times. |
| {min,max} | Matches the preceding element at least min times, but not more than max times. |
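The following Python sketch shows how each quantifier changes what is matched. The sample strings are hypothetical. `re.fullmatch` is a Python-specific convenience used here so that each pattern must cover the whole sample, which makes the effect of the quantifier easier to see.

```python
import re

# Hypothetical samples: A, then zero to three X characters, then B.
samples = ["AB", "AXB", "AXXB", "AXXXB"]

def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

optional = matching("AX?B")          # X zero or one time
zero_or_more = matching("AX*B")      # X zero or more times
one_or_more = matching("AX+B")       # X one or more times
exactly_two = matching("AX{2}B")     # X exactly two times
two_to_three = matching("AX{2,3}B")  # X two or three times

# Alternation: match either "abc" or "def".
either = [s for s in ["abc", "def", "xyz"] if re.search("abc|def", s)]

print(optional)      # ['AB', 'AXB']
print(zero_or_more)  # ['AB', 'AXB', 'AXXB', 'AXXXB']
print(one_or_more)   # ['AXB', 'AXXB', 'AXXXB']
print(exactly_two)   # ['AXXB']
print(two_to_three)  # ['AXXB', 'AXXXB']
print(either)        # ['abc', 'def']
```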
A space between characters is considered a character itself. You should avoid using spaces when writing Regular Expressions, unless the pattern you are matching expressly calls for them.
Anchors
| Metacharacter | Description |
| --- | --- |
| ^ | Matches at the start of the line. |
| $ | Matches at the end of the line. |
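Anchors pin a pattern to the start or end of the value being tested. The Python sketch below uses hypothetical codes, and Python's `re` module is assumed to treat `^` and `$` the same way for single-line values like these.

```python
import re

# Hypothetical codes used only to illustrate anchoring.
codes = ["INTCUST400", "RETINTCUST400", "INTCUST4001", "CUST400"]

def select(pattern):
    """Return the codes in which the pattern is found."""
    return [c for c in codes if re.search(pattern, c)]

starts_with = select("^INTCUST")  # INTCUST must be at the start
ends_with = select("400$")        # 400 must be at the end
exact = select("^INTCUST400$")    # the whole value must be INTCUST400

print(starts_with)  # ['INTCUST400', 'INTCUST4001']
print(ends_with)    # ['INTCUST400', 'RETINTCUST400', 'CUST400']
print(exact)        # ['INTCUST400']
```

Without anchors, the pattern INTCUST would also match RETINTCUST400, because the substring appears in the middle of that code.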
Build Regular Expressions using trial and error. If your first attempt doesn't yield the desired results, modify the expression and try again.
Hierarchical Structure
To simplify your pattern matching efforts, you should organize your groupings according to a hierarchical structure. Again, the following groupings support pattern matching using Regular Expressions:
- Class Codes
- Planner Codes
- Item Groups
- Product Categories
- Customer Types
- Customer Groups
Let's consider the Customer Type grouping to illustrate this point about hierarchies. If your Customer Types have been arranged hierarchically, the naming convention will exhibit a logical, sequential order. The following list shows an orderly, hierarchical arrangement of Customer Types:
- CUSTUSA100, 101, 102, ...
- CUSTEUROPE100, 101, 102, ...
- CUSTASIA100, 101, 102, ...
Regular Expressions will find and match any pattern, but writing them will be easier if you arrange your groupings hierarchically, as shown in the example above. For more detailed information about pattern matching, numerous websites are available.
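To show why a hierarchical naming convention helps, the Python sketch below selects subsets of Customer Types with short patterns. The list expands the naming scheme above into a few hypothetical codes, and Python's `re` module again stands in for the xTuple ERP engine.

```python
import re

# Hypothetical Customer Types following the hierarchical naming scheme.
customer_types = [
    "CUSTUSA100", "CUSTUSA101",
    "CUSTEUROPE100", "CUSTEUROPE101",
    "CUSTASIA100", "CUSTASIA102",
]

def select(pattern):
    """Return the Customer Types in which the pattern is found."""
    return [c for c in customer_types if re.search(pattern, c)]

usa = select("^CUSTUSA")                         # one region
europe_or_asia = select("^CUSTEUROPE|^CUSTASIA") # two regions
all_100 = select("^CUST[A-Z]+100$")              # code 100 in every region

print(usa)             # ['CUSTUSA100', 'CUSTUSA101']
print(europe_or_asia)  # ['CUSTEUROPE100', 'CUSTEUROPE101', 'CUSTASIA100', 'CUSTASIA102']
print(all_100)         # ['CUSTUSA100', 'CUSTEUROPE100', 'CUSTASIA100']
```

Because every code shares the CUST prefix followed by a region and a number, each subset can be described with a single short expression.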
