Channel: UNIX and Linux Forums

How do we write an exception in a regex?

Hello,
This is a follow-up to my earlier request about identifying sentence boundaries while generating snippets for a search engine. The basic regex I have written to delimit sentence boundaries handles numbers and acronyms, but I cannot get it to handle cases such as:
Quote:

Mr. Andrew visited me.
Mrs. Smith left for London.
The full stops after "Mr." and "Mrs." are automatically treated as sentence delimiters, which is not desirable.
I tried the following syntax:
Code:

!(Dr\.|Mr\.|Mrs\.|Ms\.|[A-Z]\.|i\.e\.|w\.r\.t\.|e\.g\.|etc\.|viz\.)
to make the regex ignore a full stop after the cases enumerated above, but it does not work.
In fact, the simple regex I had written has become murky and no longer works as intended.
Any help in correcting the regex would be appreciated.

Some sample sentences are given below:
Quote:

Mr. Andrew came.
Ms. Smith left for London.
He brought three things viz. bread, cheese and wine
This is w.r.t. your application
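
For reference, one possible way to express such exceptions is sketched below in Python. This is only a sketch: the post does not say which tool or regex flavour is in use, so Python is an assumption, and the names `ABBREVIATIONS`, `SENTENCE_BREAK` and `split_sentences` are made up for illustration. In most regex dialects a leading `!` is an ordinary literal character rather than a negation, which is probably why the attempt above has no effect. Negative lookbehind, `(?<!...)`, is the usual construct for "not preceded by". Python's `re` module requires every lookbehind to have a fixed width, so the sketch chains one lookbehind per abbreviation instead of putting the whole alternation in a single group.

```python
import re

# Abbreviations taken from the attempted pattern above; a full stop
# directly after any of these should not end a sentence.
ABBREVIATIONS = ["Dr.", "Mr.", "Mrs.", "Ms.", "i.e.", "w.r.t.", "e.g.", "etc.", "viz."]

# One fixed-width negative lookbehind per abbreviation. The \b keeps
# "Mr." from matching at the end of a longer word.
guards = "".join(rf"(?<!\b{re.escape(abbr)})" for abbr in ABBREVIATIONS)

# Single capital initials such as "J." (the [A-Z]\. case in the post).
guards += r"(?<!\b[A-Z]\.)"

# A sentence break is whitespace that follows ., ! or ? and is not
# preceded by one of the guarded abbreviations.
SENTENCE_BREAK = re.compile(guards + r"(?<=[.!?])\s+")


def split_sentences(text):
    """Split text into sentences, skipping the guarded abbreviations."""
    return [part for part in SENTENCE_BREAK.split(text.strip()) if part]


if __name__ == "__main__":
    # The sample sentences from the post, joined into one paragraph
    # (full stops added to the last two for the purpose of the demo).
    sample = (
        "Mr. Andrew came. Ms. Smith left for London. "
        "He brought three things viz. bread, cheese and wine. "
        "This is w.r.t. your application."
    )
    for sentence in split_sentences(sample):
        print(sentence)
```

Run on the joined samples, this should print the four sentences one per line, with "Mr.", "Ms.", "viz." and "w.r.t." left inside their sentences. The trade-off is that a sentence which really does end in a listed abbreviation (for example "... bread, cheese, etc. He left.") or in a single capital letter will not be split, so the list has to be tuned to the corpus.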
