More doxygen overview updates.

git-svn-id: https://svn.wxwidgets.org/svn/wx/wxWidgets/trunk@52098 c3d73ce0-8a6f-49c7-b76d-6d57e0e08775
Bryan Petty
2008-02-26 00:05:33 +00:00
parent 2bdba54504
commit 877b5c30d6
2 changed files with 156 additions and 217 deletions


@@ -95,8 +95,8 @@ ALIASES += NULL="<b><tt>NULL</tt></b>"
 # @wxEmptyString, @wxNullIcon, etc etc
 # table aliases for 2 and 3 column tables
-ALIASES += beginTable="<center><table class='doctable'>"
-ALIASES += beginInvisibleTable="<center><table class='doctable' border='0'>"
+ALIASES += beginTable="<center><table class='doctable' border='1' cellspacing='0' cellpadding='3'>"
+ALIASES += beginInvisibleTable="<center><table class='doctable' border='0' cellspacing='0' cellpadding='4'>"
 ALIASES += row2col{2}="<tr><td>\1</td> <td>\2</td></tr>"
 ALIASES += row2col{3}="<tr><td>\1</td> <td>\2,\3</td></tr>"
 ALIASES += row2col{4}="<tr><td>\1</td> <td>\2,\3,\4</td></tr>"
@@ -149,6 +149,11 @@ ALIASES += endWxPerlOnly="\endif"
 ALIASES += beginWxPythonOnly="\if WXPYTHON_MANUAL \n <b>wxPython note</b>\n\n"
 ALIASES += endWxPythonOnly="\endif"
+# This is to get around a bug in Doxygen that prevents use of '{' or '}'
+# inside of any aliased command in parameters.
+ALIASES += leftCurly="{"
+ALIASES += rightCurly="}"
 #---------------------------------------------------------------------------
 # Build related configuration options


@@ -13,8 +13,6 @@
 A <em>regular expression</em> describes strings of characters. It's a pattern
 that matches certain strings and doesn't match others.
-@seealso #wxRegEx
 @li @ref overview_resyntax_differentflavors
 @li @ref overview_resyntax_syntax
 @li @ref overview_resyntax_bracket
@@ -25,187 +23,160 @@ that matches certain strings and doesn't match others.
 @li @ref overview_resyntax_bre
 @li @ref overview_resyntax_characters
+@seealso
+@li #wxRegEx
 <hr>
-@section overview_resyntax_differentflavors Different Flavors of REs
+@section overview_resyntax_differentflavors Different Flavors of Regular Expressions
-Regular expressions ("RE''s), as defined by POSIX, come in two
-flavors: @e extended REs ("EREs'') and @e basic REs ("BREs''). EREs are roughly those
-of the traditional @e egrep, while BREs are roughly those of the traditional
-@e ed. This implementation adds a third flavor, @e advanced REs ("AREs''), basically
+Regular expressions (RE), as defined by POSIX, come in two flavors:
+<em>extended regular expressions</em> (ERE) and <em>basic regular
+expressions</em> (BRE). EREs are roughly those of the traditional @e egrep,
+while BREs are roughly those of the traditional @e ed. This implementation
+adds a third flavor: <em>advanced regular expressions</em> (ARE), basically
 EREs with some significant extensions.
-This manual page primarily describes
-AREs. BREs mostly exist for backward compatibility in some old programs;
-they will be discussed at the #end. POSIX EREs are almost an exact subset
-of AREs. Features of AREs that are not present in EREs will be indicated.
+This manual page primarily describes AREs. BREs mostly exist for backward
+compatibility in some old programs. POSIX EREs are almost an exact subset of
+AREs. Features of AREs that are not present in EREs will be indicated.
 @section overview_resyntax_syntax Regular Expression Syntax
-These regular expressions are implemented using
-the package written by Henry Spencer, based on the 1003.2 spec and some
-(not quite all) of the Perl5 extensions (thanks, Henry!). Much of the description
-of regular expressions below is copied verbatim from his manual entry.
+These regular expressions are implemented using the package written by Henry
+Spencer, based on the 1003.2 spec and some (not quite all) of the Perl5
+extensions (thanks, Henry!). Much of the description of regular expressions
+below is copied verbatim from his manual entry.
-An ARE is one or more @e branches, separated by '@b |', matching anything that matches
-any of the branches.
-A branch is zero or more @e constraints or @e quantified
-atoms, concatenated. It matches a match for the first, followed by a match
-for the second, etc; an empty branch matches the empty string.
-A quantified atom is an @e atom possibly followed by a single @e quantifier. Without a quantifier,
-it matches a match for the atom. The quantifiers, and what a so-quantified
-atom matches, are:
-@b *
-a sequence of 0 or more matches of the atom
-@b +
-a sequence of 1 or more matches of the atom
-@b ?
-a sequence of 0 or 1 matches of the atom
-@b {m}
-a sequence of exactly @e m matches of the atom
-@b {m,}
-a sequence of @e m or more matches of the atom
-@b {m,n}
-a sequence of @e m through @e n (inclusive)
-matches of the atom; @e m may not exceed @e n
-@b *? +? ?? {m}? {m,}? {m,n}?
-@e non-greedy quantifiers,
-which match the same possibilities, but prefer the
-smallest number rather than the largest number of matches (see #Matching)
-The forms using @b { and @b } are known as @e bounds. The numbers @e m and @e n are unsigned
-decimal integers with permissible values from 0 to 255 inclusive.
-An atom is one of:
-@b (re)
-(where @e re is any regular expression) matches a match for
-@e re, with the match noted for possible reporting
-@b (?:re)
-as previous, but
-does no reporting (a "non-capturing'' set of parentheses)
-@b ()
-matches an empty
-string, noted for possible reporting
-@b (?:)
-matches an empty string, without reporting
-@b [chars]
-a @e bracket expression, matching any one of the @e chars
-(see @ref resynbracket_overview for more detail)
-@b .
-matches any single character
-@b \k
-(where @e k is a non-alphanumeric character)
-matches that character taken as an ordinary character, e.g. \\ matches a backslash
-character
-@b \c
-where @e c is alphanumeric (possibly followed by other characters),
-an @e escape (AREs only), see #Escapes below
-@b {
-when followed by a character
-other than a digit, matches the left-brace character '@b {'; when followed by
-a digit, it is the beginning of a @e bound (see above)
-@b x
-where @e x is a single
-character with no other significance, matches that character.
-A @e constraint matches an empty string when specific conditions are met. A constraint may
-not be followed by a quantifier. The simple constraints are as follows;
-some more constraints are described later, under #Escapes.
-@b ^
-matches at the beginning of a line
-@b $
-matches at the end of a line
-@b (?=re)
-@e positive lookahead
-(AREs only), matches at any point where a substring matching @e re begins
-@b (?!re)
-@e negative lookahead (AREs only),
-matches at any point where no substring matching @e re begins
-The lookahead constraints may not contain back references
-(see later), and all parentheses within them are considered non-capturing.
-An RE may not end with '@b \'.
+An ARE is one or more @e branches, separated by "|", matching anything that
+matches any of the branches.
+A branch is zero or more @e constraints or @e quantified atoms, concatenated.
+It matches a match for the first, followed by a match for the second, etc; an
+empty branch matches the empty string.
+A quantified atom is an @e atom possibly followed by a single @e quantifier.
+Without a quantifier, it matches a match for the atom. The quantifiers, and
+what a so-quantified atom matches, are:
+@beginTable
+@row2col{ <tt>*</tt> ,
+A sequence of 0 or more matches of the atom. }
+@row2col{ <tt>+</tt> ,
+A sequence of 1 or more matches of the atom. }
+@row2col{ <tt>?</tt> ,
+A sequence of 0 or 1 matches of the atom. }
+@row2col{ <tt>{m}</tt> ,
+A sequence of exactly @e m matches of the atom. }
+@row2col{ <tt>{m\,}</tt> ,
+A sequence of @e m or more matches of the atom. }
+@row2col{ <tt>{m\,n}</tt> ,
+A sequence of @e m through @e n (inclusive) matches of the atom; @e m may
+not exceed @e n. }
+@row2col{ <tt>*? +? ?? {m}? {m\,}? {m\,n}?</tt> ,
+@e Non-greedy quantifiers, which match the same possibilities, but prefer
+the smallest number rather than the largest number of matches (see
+@ref overview_resyntax_matching). }
+@endTable
+The forms using @b { and @b } are known as @e bounds. The numbers @e m and
+@e n are unsigned decimal integers with permissible values from 0 to 255
+inclusive. An atom is one of:
+@beginTable
+@row2col{ <tt>(re)</tt> ,
+Where @e re is any regular expression, matches for @e re, with the match
+captured for possible reporting. }
+@row2col{ <tt>(?:re)</tt> ,
+As previous, but does no reporting (a "non-capturing" set of
+parentheses). }
+@row2col{ <tt>()</tt> ,
+Matches an empty string, captured for possible reporting. }
+@row2col{ <tt>(?:)</tt> ,
+Matches an empty string, without reporting. }
+@row2col{ <tt>[chars]</tt> ,
+A <em>bracket expression</em>, matching any one of the @e chars (see
+@ref overview_resyntax_bracket for more details). }
+@row2col{ <tt>.</tt> ,
+Matches any single character. }
+@row2col{ <tt>@\k</tt> ,
+Where @e k is a non-alphanumeric character, matches that character taken
+as an ordinary character, e.g. @\@\ matches a backslash character. }
+@row2col{ <tt>@\c</tt> ,
+Where @e c is alphanumeric (possibly followed by other characters), an
+@e escape (AREs only), see @ref overview_resyntax_escapes below. }
+@row2col{ <tt>@leftCurly</tt> ,
+When followed by a character other than a digit, matches the left-brace
+character "@leftCurly"; when followed by a digit, it is the beginning of a
+@e bound (see above). }
+@row2col{ <tt>x</tt> ,
+Where @e x is a single character with no other significance, matches that
+character. }
+@endTable
+A @e constraint matches an empty string when specific conditions are met. A
+constraint may not be followed by a quantifier. The simple constraints are as
+follows; some more constraints are described later, under
+@ref overview_resyntax_escapes.
+@beginTable
+@row2col{ <tt>^</tt> ,
+Matches at the beginning of a line. }
+@row2col{ <tt>@$</tt> ,
+Matches at the end of a line. }
+@row2col{ <tt>(?=re)</tt> ,
+@e Positive lookahead (AREs only), matches at any point where a substring
+matching @e re begins. }
+@row2col{ <tt>(?!re)</tt> ,
+@e Negative lookahead (AREs only), matches at any point where no substring
+matching @e re begins. }
+@endTable
+The lookahead constraints may not contain back references (see later), and all
+parentheses within them are considered non-capturing. A RE may not end with
+"\".
 @section overview_resyntax_bracket Bracket Expressions
-A @e bracket expression is a list
-of characters enclosed in '@b []'. It normally matches any single character from
-the list (but see below). If the list begins with '@b ^', it matches any single
-character (but see below) @e not from the rest of the list.
-If two characters
-in the list are separated by '@b -', this is shorthand for the full @e range of
-characters between those two (inclusive) in the collating sequence, e.g.
-@b [0-9] in ASCII matches any decimal digit. Two ranges may not share an endpoint,
-so e.g. @b a-c-e is illegal. Ranges are very collating-sequence-dependent, and portable
-programs should avoid relying on them.
-To include a literal @b ] or @b - in the
-list, the simplest method is to enclose it in @b [. and @b .] to make it a collating
-element (see below). Alternatively, make it the first character (following
-a possible '@b ^'), or (AREs only) precede it with '@b \'.
-Alternatively, for '@b -', make
-it the last character, or the second endpoint of a range. To use a literal
-@b - as the first endpoint of a range, make it a collating element or (AREs
-only) precede it with '@b \'. With the exception of these, some combinations using
-@b [ (see next paragraphs), and escapes, all other special characters lose
-their special significance within a bracket expression.
-Within a bracket
-expression, a collating element (a character, a multi-character sequence
-that collates as if it were a single character, or a collating-sequence
-name for either) enclosed in @b [. and @b .] stands for the
-sequence of characters of that collating element.
-@e wxWidgets: Currently no multi-character collating elements are defined.
-So in @b [.X.], @e X can either be a single character literal or
-the name of a character. For example, the following are both identical
-@b [[.0.]-[.9.]] and @b [[.zero.]-[.nine.]] and mean the same as
-@b [0-9].
-See @ref resynchars_overview.
+A <em>bracket expression</em> is a list of characters enclosed in <tt>[]</tt>.
+It normally matches any single character from the list (but see below). If the
+list begins with @c ^, it matches any single character (but see below) @e not
+from the rest of the list.
+If two characters in the list are separated by <tt>-</tt>, this is shorthand
+for the full @e range of characters between those two (inclusive) in the
+collating sequence, e.g. <tt>[0-9]</tt> in ASCII matches any decimal digit.
+Two ranges may not share an endpoint, so e.g. <tt>a-c-e</tt> is illegal.
+Ranges are very collating-sequence-dependent, and portable programs should
+avoid relying on them.
+To include a literal <tt>]</tt> or <tt>-</tt> in the list, the simplest method
+is to enclose it in <tt>[.</tt> and <tt>.]</tt> to make it a collating element
+(see below). Alternatively, make it the first character (following a possible
+<tt>^</tt>), or (AREs only) precede it with <tt>@\</tt>. Alternatively, for
+<tt>-</tt>, make it the last character, or the second endpoint of a range. To
+use a literal <tt>-</tt> as the first endpoint of a range, make it a collating
+element or (AREs only) precede it with <tt>@\</tt>. With the exception of
+these, some combinations using <tt>[</tt> (see next paragraphs), and escapes,
+all other special characters lose their special significance within a bracket
+expression.
+Within a bracket expression, a collating element (a character, a
+multi-character sequence that collates as if it were a single character, or a
+collating-sequence name for either) enclosed in <tt>[.</tt> and <tt>.]</tt>
+stands for the sequence of characters of that collating element.
+@e wxWidgets: Currently no multi-character collating elements are defined. So
+in <tt>[.X.]</tt>, @c X can either be a single character literal or the name
+of a character. For example, the following are both identical:
+<tt>[[.0.]-[.9.]]</tt> and <tt>[[.zero.]-[.nine.]]</tt> and mean the same as
+<tt>[0-9]</tt>. See @ref overview_resyntax_characters.
 Within a bracket expression, a collating element enclosed in @b [= and @b =]
 is an equivalence class, standing for the sequences of characters of all
 collating elements equivalent to that one, including itself.
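Reviewer note: the quantifier, atom, constraint and bracket-expression rules documented in the hunk above can be tried out with a short sketch. It uses Python's `re` module purely as a stand-in, which is an assumption of this note: `re` is neither wxRegEx nor Henry Spencer's ARE engine, so only constructs that both appear to share are exercised (collating elements such as `[.X.]` are deliberately left out).

```python
import re

# NOTE: Python's `re` is a stand-in here, not wxRegEx or Spencer's ARE package.

text = "<b>one</b><b>two</b>"

# Greedy `*` prefers the largest number of matches, non-greedy `*?` the smallest.
greedy = re.search(r"<b>.*</b>", text).group(0)
lazy = re.search(r"<b>.*?</b>", text).group(0)

# Bounds: {m,n} matches m through n (inclusive) repetitions of the atom.
two_to_three = [bool(re.fullmatch(r"a{2,3}", s)) for s in ("a", "aa", "aaa", "aaaa")]

# (re) captures the match for reporting; (?:re) groups without capturing.
captured = re.match(r"(\d+)-(?:\d+)", "12-34").groups()

# Lookahead constraints match an empty string, so they consume no input.
positive = re.findall(r"\w+(?=!)", "hi! there ok!")
negative = re.findall(r"\b\w+\b(?!!)", "hi! there ok!")

# Bracket expressions: a range, a leading `^` for negation, a literal `]`
# placed first, and a literal `-` placed last.
digits = re.findall(r"[0-9]+", "a1b22")
non_digits = re.findall(r"[^0-9]+", "a1b22")
literal_bracket = re.findall(r"[]x]", "a]x")
literal_dash = re.findall(r"[a-]", "a-b")
```

With wxRegEx the same patterns would be compiled under the advanced flavor; that mapping is not verified here.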
@@ -219,57 +190,20 @@ the name of a @e character class enclosed in @b [: and @b :] stands for the list
 of all characters (not all collating elements!) belonging to that class.
 Standard character classes are:
-@b alpha
-A letter.
-@b upper
-An upper-case letter.
-@b lower
-A lower-case letter.
-@b digit
-A decimal digit.
-@b xdigit
-A hexadecimal digit.
-@b alnum
-An alphanumeric (letter or digit).
-@b print
-An alphanumeric (same as alnum).
-@b blank
-A space or tab character.
-@b space
-A character producing white space in displayed text.
-@b punct
-A punctuation character.
-@b graph
-A character with a visible representation.
-@b cntrl
-A control character.
+@beginTable
+@row2col{ <tt>alpha</tt> , A letter. }
+@row2col{ <tt>upper</tt> , An upper-case letter. }
+@row2col{ <tt>lower</tt> , A lower-case letter. }
+@row2col{ <tt>digit</tt> , A decimal digit. }
+@row2col{ <tt>xdigit</tt> , A hexadecimal digit. }
+@row2col{ <tt>alnum</tt> , An alphanumeric (letter or digit). }
+@row2col{ <tt>print</tt> , An alphanumeric (same as alnum). }
+@row2col{ <tt>blank</tt> , A space or tab character. }
+@row2col{ <tt>space</tt> , A character producing white space in displayed text. }
+@row2col{ <tt>punct</tt> , A punctuation character. }
+@row2col{ <tt>graph</tt> , A character with a visible representation. }
+@row2col{ <tt>cntrl</tt> , A control character. }
+@endTable
 A character class may not be used as an endpoint of a range.
 @e wxWidgets: In a non-Unicode build, these character classifications depend on the
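Reviewer note: the statement that a class name enclosed in `[:` and `:]` "stands for the list of all characters belonging to that class" can be illustrated by expanding the names by hand. The sketch below is an assumption-laden illustration, not the wxRegEx implementation: `POSIX_CLASSES` and `expand_posix_classes` are hypothetical names, the mapping is ASCII-only, it covers only some of the classes in the table, and Python's `re` (which has no `[:name:]` syntax of its own) stands in for the real engine.

```python
import re

# ASCII-only approximations of some classes from the table (an assumption
# for illustration; a real engine consults the locale or Unicode data).
POSIX_CLASSES = {
    "alpha": "a-zA-Z",
    "upper": "A-Z",
    "lower": "a-z",
    "digit": "0-9",
    "xdigit": "0-9A-Fa-f",
    "alnum": "a-zA-Z0-9",
    "blank": " \\t",
}


def expand_posix_classes(pattern: str) -> str:
    """Replace every [:name:] with its ASCII character list (hypothetical helper).

    Simplification: the replacement is applied anywhere in the pattern, not
    only inside bracket expressions, and an unknown name raises KeyError.
    """
    return re.sub(r"\[:([a-z]+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)


# [[:alpha:]_] becomes [a-zA-Z_], and so on.
identifier = expand_posix_classes(r"[[:alpha:]_][[:alnum:]_]*")
hex_byte = expand_posix_classes(r"[[:xdigit:]]{2}")
```

Because the expansion yields plain character lists, the rule that a class may not be a range endpoint follows naturally: a list such as `a-zA-Z` has no single character to serve as an endpoint.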
@@ -288,13 +222,13 @@ use constraint escapes instead (see #Escapes below).
 @section overview_resyntax_escapes Escapes
 Escapes (AREs only),
-which begin with a @b \ followed by an alphanumeric character, come in several
+which begin with a <tt>@\</tt> followed by an alphanumeric character, come in several
 varieties: character entry, class shorthands, constraint escapes, and back
-references. A @b \ followed by an alphanumeric character but not constituting
+references. A <tt>@\</tt> followed by an alphanumeric character but not constituting
 a valid escape is illegal in AREs. In EREs, there are no escapes: outside
-a bracket expression, a @b \ followed by an alphanumeric character merely stands
+a bracket expression, a <tt>@\</tt> followed by an alphanumeric character merely stands
 for that character as an ordinary character, and inside a bracket expression,
-@b \ is an ordinary character. (The latter is the one actual incompatibility
+<tt>@\</tt> is an ordinary character. (The latter is the one actual incompatibility
 between EREs and AREs.)
 Character-entry escapes (AREs only) exist to make
 it easier to specify non-printing and otherwise inconvenient characters