Hello,
Actually this is a follow-up to my earlier request about identifying sentence boundaries while generating snippets for a search engine. The basic regex I have written to delimit sentence boundaries handles numbers and acronyms, but I cannot get it to handle cases such as:
Quote:
Mr. Andrew visited me. Mrs. Smith left for London.
The full stops after Mr. and Mrs. are automatically treated as sentence delimiters, which is not desirable.
I tried the following syntax:
Code:
!(Dr\.|Mr\.|Mrs\.|Ms\.|[A-Z]\.|i\.e\.|w\.r\.t\.|e\.g\.|etc\.|viz\.)
to make the regex ignore a full stop after the cases enumerated, but it does not work.
In fact, the simple regex I had written has become murky and no longer works at all.
Any help in correcting the regex would be appreciated.
Some sample sentences are given below:
Quote:
Mr. Andrew came. Ms. Smith left for London. He brought three things viz. bread, cheese and wine This is w.r.t. your application
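
For reference, here is a minimal sketch of the behaviour I am after, written in Python purely as an assumption since nothing above depends on a particular language. In most regex flavours a leading `!` is not a negation operator; the usual construct is a negative lookbehind `(?<!...)`, which in Python's `re` must be fixed-width. The sketch sidesteps that by splitting naively first and then re-joining any piece that ends in a listed abbreviation. The names `ABBREVIATIONS` and `split_sentences` are hypothetical, and the abbreviation list is simply the one from the attempted regex.

```python
import re

# Abbreviations whose trailing full stop should not end a sentence.
# Taken from the attempted regex in the post; extend as needed.
ABBREVIATIONS = {"Dr.", "Mr.", "Mrs.", "Ms.", "i.e.", "w.r.t.", "e.g.", "etc.", "viz."}


def split_sentences(text):
    """Split text into sentences, ignoring full stops after known abbreviations."""
    # First pass: break after '.', '!' or '?' when followed by whitespace.
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = []
    carry = ""  # text accumulated across false boundaries
    for piece in pieces:
        if not piece:
            continue
        candidate = f"{carry} {piece}" if carry else piece
        last_token = candidate.split()[-1]
        # A single capital letter plus a full stop is treated as an initial.
        is_initial = re.fullmatch(r"[A-Z]\.", last_token) is not None
        if last_token in ABBREVIATIONS or is_initial:
            carry = candidate  # not a real boundary; keep accumulating
        else:
            sentences.append(candidate)
            carry = ""
    if carry:
        sentences.append(carry)
    return sentences


if __name__ == "__main__":
    for sentence in split_sentences("Mr. Andrew visited me. Mrs. Smith left for London."):
        print(sentence)
```

One known limitation of this sketch: an abbreviation such as etc. that genuinely ends a sentence is also joined to the following sentence, so it would need an extra check (for example, on whether the next word is capitalised).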