Skip to content

RegExpImport

Paweł Salawa edited this page Jan 15, 2018 · 1 revision

Brief

The RegExp import plugin reads data from text files using regular expression.

Files are parsed regardless of new line characters, so the regular expression can match strings being spread across several text lines.

The plugin uses Qt's QRegularExpression class for matching, so you can read Qt documentation about that class to learn more about possible patterns.

Examples

Simple example

We will import into existing table with 3 columns. Table was created earlier by the statement:

CREATE TABLE test_table (col1, col2, col3);

Input text file contents:

xxxxxx3yyyyyy54zzzzzz5aaaaaa23bbbb4cc54565xxxxx yyy333zzzzzzzzzz154aaaaaa35bbb87

Regular expression configured for the RegExp import plugin:

\D*?(\d+)\D+?(\d+)\D+?(\d+)\D*?

The expression says, that we want to match 3 numbers separated by anything that is not a number. All 3 numbers are enclosed in capture groups, so they will be used as the data to import.

In result table test_table will have 3 new rows:

3 54 5
23 4 54565
333 154 35

The last number will be ignored, as our expression expected all 3 numbers to be present, so one number doesn't fulfill the expression.

Example with named group

Use the same data input file, but change the expression to:

\D*?(\d+)\D+?(\d+)(?\D+?)(\d+)\D*?

This expression uses named group, which captures word between 3rd and 4th groups. Then select option "Import only following groups" and type following groups:

3, word, 1

We use "word" as a group, cause we named it earlier. Using name that is not defined in the regular expression pattern will result in validation error and you will be notified about it.

Unnamed groups are referred with index number, starting with 1. The 0 index refers to the entire matched string. Note that in this example we switched group 3 and 1 with each other, so we will import 3rd group into 1st column and 1st group into 3rd column. It's just for showing the possibilities.

You should have the import dialog configured like this: Shot38.png

As a result we will import followin 3 rows:

5 zzzzzz 3
54565 cc 23
35 aaaaaa 333

SQL function options

Description of options that can be defined for the plugin, when executing import():

Option Valid values Description
RegExpImport.Pattern An expression Regular Expression pattern, as described previously on this page.
RegExpImport.GroupsMode all / custom If all, then any RegExp groups declared in the pattern will be used for subsequent columns in the output table. If custom is used, then only groups listedn in RegExpImport.CustomGroupList will be used for importing into subsequent table columns.
RegExpImport.CustomGroupList comma-separated list of group indexes or names List of groups from the pattern to be imported. It's a comma separated list and may contain extra whitespaces, which are ignored.