MySQL Full-Text Index

MySQL Full-Text Index

MySQL 5.7.6 supports full-text index for Chinese, Japanese, and Korean (CJK).

The built-in MySQL full-text parser uses the white space between words as a delimiter to determine where words begin and end, which is a limitation when working with ideographic languages that do not use word delimiters. To address this limitation, MySQL provides an ngram full-text parser that supports Chinese, Japanese, and Korean (CJK). The ngram full-text parser is supported for use with InnoDB and MyISAM.

Create a Full-Text Index

Creating a FULLTEXT Index that Uses the ngram Parser

CREATE FULLTEXT INDEX content_fulltext ON table_name(column1, column2,...) with parser ngram;

Full-Text Searches

Natural Language Full-Text Searches

SELECT COUNT(*) FROM articles
WHERE MATCH (title,body) AGAINST ('database' IN NATURAL LANGUAGE MODE);
-- or
SELECT COUNT(*) FROM articles
WHERE MATCH (title,body) AGAINST ('database');

the columns named in the MATCH() function (title and body) are the same as those named in the definition of the article table’s FULLTEXT index. To search the title or body separately, you would create separate FULLTEXT indexes for each column.

Boolean Full-Text Searches

-- retrieves all the rows that contain the word “MySQL” but that do not contain the word “YourSQL”
SELECT * FROM articles WHERE MATCH (title,body)
AGAINST ('+MySQL -YourSQL' IN BOOLEAN MODE);
  • + stands for AND
  • - stands for NOT
  • [no operator] implies OR. The word is optional, but the rows that contain it are rated higher.
  • “”. A phrase that is enclosed within double quote (") characters matches only rows that contain the phrase literally, as it was typed.

References

[1] ngram Full-Text Parser - MySQL 5.7 Documentation

[2] MySQL 全文索引

[3] MySql5.6全文索引 及 5.7 中文索引插件ngram