ABOUT ME

-

Today
-
Yesterday
-
Total
-
  • Regular Expression
    컴퓨터/language 2006. 5. 12. 15:00
    Table 4-2. Regular expression metacharacter syntax

    Subexpression

    Matches

    Notes

    General

    ^

    Start of line/string

     

    $

    End of line/string

     

    \b

    Word boundary

     

    \B

    Not a word boundary

     

    \A

    Beginning of entire string

     

    \z

    End of entire string

     

    \Z

    End of entire string (except allowable final line terminator)

     

    .

    Any one character (except line terminator)

     

    [...]

    "Character class"; any one character from those listed

     

    [^...]

    Any one character not from those listed


     

    Alternation and grouping

    (...)

    Grouping (capture groups)


     

    |

    Alternation

     

    (?:re)

    Noncapturing parenthesis

     

    \G

    End of the previous match

     

    \n

    Back-reference to capture group number "n"

     

    Normal (greedy) multipliers

    {m,n}

    Multiplier for "from m to n repetitions"

     

    {m,}

    Multiplier for "m or more repetitions"

     

    {m}

    Multiplier for "exactly m repetitions"

     

    {,n}

    Multiplier for 0 up to n repetitions

     

    *

    Multiplier for 0 or more repetitions

    Short for {0,}

    +

    Multiplier for 1 or more repetitions

    Short for {1,}

    ?

    Multiplier for 0 or 1 repetitions (i.e, present exactly once, or not at all)

    Short for {0,1}

    Reluctant (non-greedy) multipliers

    {m,n}?

    Reluctant multiplier for "from m to n repetitions"

     

    {m,}?

    Reluctant multiplier for "m or more repetitions"

     

    {,n}?

    Reluctant multiplier for 0 up to n repetitions

     

    *?

    Reluctant multiplier: 0 or more

     

    +?

    Reluctant multiplier: 1 or more

     

    ??

    Reluctant multiplier: 0 or 1 times

     

    Possessive (very greedy) multipliers

    {m,n}+

    Possessive multiplier for "from m to n repetitions"

     

    {m,}+

    Possessive multiplier for "m or more repetitions"

     

    {,n}+

    Possessive multiplier for 0 up to n repetitions

     

    *+

    Possessive multiplier: 0 or more

     

    ++

    Possessive multiplier: 1 or more

     

    ?+

    Possessive multiplier: 0 or 1 times

     

    Escapes and shorthands

    \

    Escape (quote) character: turns most metacharacters off; turns subsequent alphabetic into metacharacters

     

    \Q

    Escape (quote) all characters up to \E

     

    \E

    Ends quoting begun with \Q

     

    \t

    Tab character

     

    \r

    Return (carriage return) character

     

    \n

    Newline character


     

    \f

    Form feed

     

    \w

    Character in a word

    Use \w+ for a word;

    \W

    A non-word character

     

    \d

    Numeric digit

    Use \d+ for an integer;

    \D

    A non-digit character

     

    \s

    Whitespace

    Space, tab, etc., as determined by java.lang.Character.isWhitespace( )

    \S

    A nonwhitespace character


     

    Unicode blocks (representative samples)

    \p{InGreek}

    A character in the Greek block

    (simple block)

    \P{InGreek}

    Any character not in the Greek block

     

    \p{Lu}

    An uppercase letter

    (simple category)

    \p{Sc}

    A currency symbol

     

    POSIX-style character classes (defined only for US-ASCII)

    \p{Alnum}

    Alphanumeric characters

    [A-Za-z0-9]

    \p{Alpha}

    Alphabetic characters

    [A-Za-z]

    \p{ASCII}

    Any ASCII character

    [\x00-\x7F]

    \p{Blank}

    Space and tab characters

     

    \p{Space}

    Space characters

    [ \t\n\x0B\f\r]

    \p{Cntrl}

    Control characters

    [\x00-\x1F\x7F]

    \p{Digit}

    Numeric digit characters

    [0-9]

    \p{Graph}

    Printable and visible characters (not spaces or control characters)

     

    \p{Print}

    Printable characters

    Same as \p{Graph}

    \p{Punct}

    Punctuation characters

    One of !"#$%&'( )*+,-./:;<=>?@[\]^_`{|}~

    \p{Lower}

    Lowercase characters

    [a-z]

    \p{Upper}

    Uppercase characters

    [A-Z]

    \p{XDigit}

    Hexadecimal digit characters

    [0-9a-fA-F]

Designed by Tistory.