informBank

The domain name informbank.com is for sale
I'm not able to support this website any longer, so if you are interested in buying the domain name, click here.

Categories
Health Care
Opinions
Football (Soccer)
Fitness
World Companies
Technology
Space
Faith
Sport
Interesting Facts
Movie Characters
Science
Gadgets
Home Improvement


Search
Search

The Web
This Site





Feeds
RSS Feed
ATOM Feed

Add to Google
Add to MyMSN
Add to MyYahoo
Add to Bloglines
Add to Newsgator
Add to NewsIsFree
Add to Rojo
Add to Kinja
Add to Pluck

How to search exact phrases in large texts using the MySQL Regular Expressions

The MySQL Regular Expressions. Searching faster using both LIKE and REGEXP in the SQL query.

Publisher:JohnLocke
Category:Technology
Date:15 Jul 2006, 14:23 GMT
Comments:Read Comments (1) | Post a comment
Article Rank:
0 / 10
by 0 visitors

Creating a SQL query for searching an exact phrase in a given field(s) in a table is not much harder than the simple query for searching a substring. However this query could be very slow if you are searching through huge amount of text in the database. So, a good solution to this problem is first to eliminate the rows which do not contain the phrase as a substring (which is fast) and then perform the regular expression (REGEXP) only to the others. Here is what I'm talking about:

SELECT * FROM abc_table WHERE t_text LIKE '%search phrase%' AND t_text REGEXP '[[:<:]]search phrase[[:>:]]'

For those of you who have no previous experience with REGEXP in SQL, here are some examples and explanations:

The '[[:<:]]' I used above is used to specify the start of a word and '[[:>:]]' is for the end of a word. As you've seen they are useful to match an exact phrase.

This will match "Connecting to this host" but it won't match "I'm looking for a hosting":

... REGEXP '[[:<:]]host[[:>:]]'

To match only letters and digits, use '[[:alnum:]]'. If you want to allow only the letters from 'e' to 'w' use something like that:

... REGEXP '[e-wE-W]'
(or)
... REGEXP '[^a-dx-zA-DX-Z]'

... and to match the count of these letters:

... REGEXP '[e-wE-W]{1,5}' (between 1 and 5 times the letter between e and w)
... REGEXP '[e-wE-W]{3,}' (more than 3 times)
... REGEXP '[e-wE-W]{2}' (exactly 2 times)
... REGEXP '(host){2}' (exactly 2 times 'host': 'hosthost')

If there are multiple choices, group them with () and separate them using | :

... REGEXP '(http|ftp|irc)'

When you are using characters like ?, +, $, ^, don't forget to escape them using the \ because they are special characters:

... REGEXP '[a-z\?]'

Details about the regular expressions in MySQL are available in the official reference. This table is also very useful to understand the basics of the REGEXP in MySQL.

Back to top

Bookmark this article:

Permalink | Save to del.icio.us | Save to furl | Save to MyWeb 2.0 at Yahoo

Rank this Article:

Comments:

1. karim wrote: (15 Jul 2006, 16:35 GMT)
I'm not an expert in the sql stuff but as far I tested it on my site, it works well.

Post a comment:

Back to top

informBank -
informBank | Profiles | Archive

©2005-2008 informBank. All content in this website is property of informBank except on the pages where is specified otherwise.
RSS Feed:
Contact us at webmaster @ informbank . com