PHP Tricks for Identifying Duplicate Words in a String

Hello friends, and welcome to today's post on PHP. Today we are going to learn how to identify duplicate words in a string. Duplicate words turn up often, and once you can find them it is easy to remove them. The task is simple but needs a few tricks. The steps below are easy to follow and count every word in a single pass over the string. Let's go.

– Step 01:

The first task is to identify the individual words in the sentence or paragraph. Start by trimming the whitespace at both ends of the string and compressing each run of whitespace inside it into a single space.

– Step 02:

Next, decompose the sentence into words with explode(), using a single space as the delimiter.

– Step 03:

Next, a new associative array, $wordStats, is initialized, and a key is created within it for every word in the original string. The first time a word is seen, its key is created with a value of 1; each time the word occurs again, the value for that key in the $wordStats array is incremented by 1.

– Step 04:

Once all the words in the string have been processed, the $wordStats array will contain a list of unique words from the original string, together with a number indicating each word’s frequency. It is now a simple matter to isolate those keys with values greater than 1, and print the corresponding words as a list of duplicates.
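As a side note, Steps 03 and 04 can also be sketched with PHP's built-in array functions instead of a hand-written loop. This is only a rough sketch: the function name findDuplicateWords() is made up for this example, and it assumes the words have already been split into an array and should be compared case-sensitively.

```php
<?php
// Hypothetical helper (not part of the original tutorial): returns the words
// that occur more than once in $words, in order of first appearance.
function findDuplicateWords(array $words): array
{
    // array_count_values() builds the word => frequency map in one call
    $wordStats = array_count_values($words);

    // keep only the entries whose frequency is greater than 1
    $duplicates = array_filter($wordStats, function ($count) {
        return $count > 1;
    });

    // the keys of the filtered map are the duplicated words
    return array_keys($duplicates);
}

$words = explode(' ', 'hello world nice hello world');
print implode(' ', findDuplicateWords($words)); // prints "hello world"
```

The manual loop makes the counting logic easier to follow, while array_count_values() keeps the code short. Either approach gives the same list of duplicates for this input.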


// define string
$string = "hello world nice hello world";
// trim the whitespace at the ends of the string
$string = trim($string);

// compress the whitespace in the middle of the string
$string = preg_replace('/\s+/', ' ', $string);

// decompose the string into an array of "words"
$words  = explode(' ', $string); 

// iterate over the array
// count occurrences of each word
// save stats to another array
$wordStats = array();
foreach ($words as $word) {
	if (isset($wordStats[$word])) {
		$wordStats[$word]++;
	} else {
		$wordStats[$word] = 1;
	}
}

// print all duplicate words
// result: "hello world "
foreach ($wordStats as $k => $v) {
	if ($v >= 2) { print "$k "; }
}

Important Note:

The ereg_* functions were deprecated in PHP 5.3 and removed in PHP 7.0, so they should not be used (and new tutorials should not be promoting them). Use the preg_* functions instead.
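To illustrate the note, here is a rough sketch of the whitespace-compression step written with preg_replace(). The function name normalizeWhitespace() is made up for this example. The main difference from the ereg_* style is that preg_* patterns must be wrapped in delimiters.

```php
<?php
// Hypothetical helper (not part of the original tutorial): trims the string
// and collapses every run of whitespace into a single space.
function normalizeWhitespace(string $text): string
{
    // preg_* patterns need delimiters (here "/"); the POSIX class
    // [[:space:]] used in ereg-style patterns is also accepted by PCRE
    return preg_replace('/[[:space:]]+/', ' ', trim($text));
}

print normalizeWhitespace("  hello   world\tnice \n hello world  ");
// prints "hello world nice hello world"
```

The shorthand \s can be used in place of [[:space:]] if you prefer the shorter pattern.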

Thanks John Conde


