re2-0.3: Bindings to the re2 regular expression library

LicenseMIT
Safe HaskellNone
LanguageHaskell98

Regex.RE2

Contents

Description

re2 is a regular expression library offering predictable run-time and memory consumption. This package is a binding to re2.

Supported expression syntax is documented at http://code.google.com/p/re2/wiki/Syntax.

$ ghci -XOverloadedStrings
ghci> import Regex.RE2

ghci> find "\\w+" "hello world"
Just (Match [Just "hello"])

ghci> find "\\w+$" "hello world"
Just (Match [Just "world"])

ghci> find "^\\w+$" "hello world"
Nothing
Synopsis

Compiling patterns

data Pattern #

A pattern is a compiled regular expression plus its compilation options.

Patterns can be created by calling compile explicitly:

import Data.ByteString.Char8 (pack)

p :: Pattern
p = case compile (pack "^hello world$") of
        Right ok -> ok
        Left err -> error ("compilation error: " ++ errorMessage err)
 

Or by using the IsString instance:

import Data.String (fromString)

p :: Pattern
p = fromString "^hello world$"

Or by using the OverloadedStrings language extension:

{-# LANGUAGE OverloadedStrings #-}

p :: Pattern
p = "^hello world$"
Instances
Eq Pattern # 
Instance details

Defined in Regex.RE2

Methods

(==) :: Pattern -> Pattern -> Bool #

(/=) :: Pattern -> Pattern -> Bool #

Show Pattern # 
Instance details

Defined in Regex.RE2

IsString Pattern # 
Instance details

Defined in Regex.RE2

Methods

fromString :: String -> Pattern #

compileWith :: Options -> ByteString -> Either Error Pattern #

Compile a regular expression with the given options. If compilation fails, the error can be inspected with errorMessage and errorCode.

Use optionEncoding to select whether the input bytes should be interpreted as UTF-8 or Latin1. The default is UTF8.

Pattern properties

patternInput :: Pattern -> ByteString #

The regular expression originally provided to compileWith.

patternOptions :: Pattern -> Options #

The options originally provided to compileWith.

patternGroups :: Pattern -> Vector (Maybe ByteString) #

The capturing groups defined within the pattern. Groups are listed from left to right, and are Nothing if the group is unnamed.

ghci> patternGroups "(\\d+)|(?P<word>\\w+)"
fromList [Nothing,Just "word"]

Options

data Options #

Options controlling how to compile a regular expression. The fields in this value may be set using record syntax:

compileNoCase :: B.ByteString -> Either Error Pattern
compileNoCase = compileWith (defaultOptions { optionCaseSensitive = False })
 
Instances
Eq Options # 
Instance details

Defined in Regex.RE2

Methods

(==) :: Options -> Options -> Bool #

(/=) :: Options -> Options -> Bool #

Show Options # 
Instance details

Defined in Regex.RE2

defaultOptions :: Options #

defaultOptions = Options
        { optionEncoding = EncodingUtf8
        , optionPosixSyntax = False
        , optionLongestMatch = False
        , optionMaxMemory = 8388608  -- 8 << 20
        , optionLiteral = False
        , optionNeverNewline = False
        , optionDotNewline = False
        , optionNeverCapture = False
        , optionCaseSensitive = True
        , optionPerlClasses = False
        , optionWordBoundary = False
        , optionOneLine = False
        }
 

data Encoding #

Instances
Eq Encoding # 
Instance details

Defined in Regex.RE2

Show Encoding # 
Instance details

Defined in Regex.RE2

optionPerlClasses :: Options -> Bool #

Only checked in posix mode

optionWordBoundary :: Options -> Bool #

Only checked in posix mode

optionOneLine :: Options -> Bool #

Only checked in posix mode

Compilation errors

data Error #

Instances
Eq Error # 
Instance details

Defined in Regex.RE2

Methods

(==) :: Error -> Error -> Bool #

(/=) :: Error -> Error -> Bool #

Show Error # 
Instance details

Defined in Regex.RE2

Methods

showsPrec :: Int -> Error -> ShowS #

show :: Error -> String #

showList :: [Error] -> ShowS #

Matching

data Match #

A successful match of the pattern against some input. Capturing groups may be retrieved with matchGroup or matchGroups.

Instances
Eq Match # 
Instance details

Defined in Regex.RE2

Methods

(==) :: Match -> Match -> Bool #

(/=) :: Match -> Match -> Bool #

Show Match # 
Instance details

Defined in Regex.RE2

Methods

showsPrec :: Int -> Match -> ShowS #

show :: Match -> String #

showList :: [Match] -> ShowS #

matchGroup :: Match -> Int -> Maybe ByteString #

The capturing group with the given index, or Nothing if the group was not set in this match.

The entire match is group 0.

matchGroups :: Match -> Vector (Maybe ByteString) #

All of the groups in the pattern, with each group being Nothing if it was not set in this match. Groups are returned in the same order as patternGroups.

The entire match is group 0.

data Anchor #

Constructors

AnchorStart 
AnchorBoth 
Instances
Eq Anchor # 
Instance details

Defined in Regex.RE2

Methods

(==) :: Anchor -> Anchor -> Bool #

(/=) :: Anchor -> Anchor -> Bool #

Show Anchor # 
Instance details

Defined in Regex.RE2

match #

Arguments

:: Pattern 
-> ByteString 
-> Int

Start position

-> Int

End position

-> Maybe Anchor 
-> Int

How many match groups to populate

-> Maybe Match 

The most general matching function. Attempt to match the pattern to the input within the given constraints.

If the number of match groups to populate is 0, matching can be performed more efficiently.

Searching

find :: Pattern -> ByteString -> Maybe Match #

Attempt to find the pattern somewhere within the input.

Replacing

replace #

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Replacement template

-> (ByteString, Bool) 

Replace the first occurance of the pattern with the given replacement template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns the new bytes, and True if a replacement occured.

replaceAll #

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Replacement template

-> (ByteString, Int) 

Replace every occurance of the pattern with the given replacement template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns the new bytes, and how many replacements occured.

extract #

Arguments

:: Pattern 
-> ByteString

Input

-> ByteString

Extraction template

-> Maybe ByteString 

Attempt to find the pattern somewhere within the input, and extract it using the given template. If the template contains backslash escapes such as \1, the capture group with the given index will be inserted in their place.

Returns Nothing if the pattern was not found in the input.

Utility functions

quoteMeta :: ByteString -> ByteString #

Escapes bytes such that the output is a regular expression which will exactly match the input.