PRFilter Shows Press Release Buzzword Abuse Still Prevalent

Press release search engine PRFilter came out of beta last week. It aggregates press releases from 10 wire services and distribution services and pulls them in directly from 60+ technology and media companies. Users can filter by specific industries, countries and time periods and create personalized accounts for further customization.

It seems to work fairly well. The user interface is simple and easy to use and the sample queries I tried brought up decent results. Press releases from PR Newswire, Business Wire, PRWeb and Marketwire frequently appear in the listings. PitchEngine releases do not appear to be in the index.

While testing it out, I thought I’d revisit the press release buzzwords research I did last year. So I pulled the top 25 overused terms from that post and ran them through PRFilter.

UPDATE:
PRFilter’s Adam Parker was kind enough to pull more accurate data on the terms (see the comments below for information on why my original figures were incomplete).

Here are the updated results, covering 3,000 press releases in a 24 hour period. I’ve left my original table and the associated analysis intact below.

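| # | Buzzword / Overused Term | Matches (24 hour period) |
|---|---|---|
| 1 | leading | 776 |
| 2 | solution | 622 |
| 3 | best | 473 |
| 4 | innovate / innovative / innovator | 452 |
| 5 | leader | 410 |
| 6 | top | 370 |
| 7 | unique | 282 |
| 8 | great | 245 |
| 9 | extensive | 215 |
| 10 | leading provider | 153 |
| 11 | exclusive | 143 |
| 12 | premier | 136 |
| 13 | flexible | 119 |
| 14 | award winning / winner | 106 |
| 15 | dynamic | 95 |
| 16 | fastest | 70 |
| 17 | smart | 69 |
| 18 | state of the art | 65 |
| 19 | cutting edge | 54 |
| 20 | biggest | 54 |
| 21 | easy to use | 51 |
| 22 | largest | 34 |
| 23 | real time | 8 |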

In this sampling of releases, “leading” edges out “solution” as the most overused term.

–original post resumed–

Here are the matches for each term in the past 24 hours:

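| # | Buzzword / Overused Term | Matches (Past 24 hours) |
|---|---|---|
| 1 | solution | 243 |
| 2 | leading provider | 217 |
| 3 | leading | 116 |
| 4 | award winning | 84 |
| 5 | real-time | 59 |
| 6 | best | 52 |
| 7 | state of the art | 45 |
| 8 | cutting edge | 35 |
| 9 | leader | 31 |
| 10 | smart | 25 |
| 11 | unique | 21 |
| 12 | flexible | 18 |
| 13 | innovative | 17 |
| 14 | innovator | 17 |
| 15 | dynamic | 17 |
| 16 | innovation | 16 |
| 17 | extensive | 16 |
| 18 | premier | 15 |
| 19 | fastest | 15 |
| x | biggest | 0 |
| x | easy to use | 0 |
| x | exclusive | 0 |
| x | great | 0 |
| x | largest | 0 |
| x | top | 0 |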

“Solution” was the most overused term, but that figure is skewed by regular usage of that word. “Leading provider” also suffered considerable abuse, which is a shame because that one really is devoid of meaning.

It was good to see that six overused terms from my full list did not have any instances in the past 24 hours, but the holiday weekend in the US slowed down press release activity.

A few notes on searching with PRFilter:

  • The index gets updated very quickly. Checking some of the terms even 15 minutes later resulted in more matches.
  • Using quotation marks to limit the results to exact matches for multi-word queries (e.g. “cutting edge” instead of cutting edge) did not seem to work.
  • There was a lot of overlap in the results for “innovative,” “innovator” and “innovation.” I wanted exact matches for this list, but combining those results does make sense for the average searcher.
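
For anyone who wants exact-phrase counts without relying on a search engine's relevance logic, the tally can be done offline. The following is a minimal sketch and has nothing to do with how PRFilter works internally: it assumes you already have the release texts in a list, and the `RELEASES` sample, the `TERMS` grouping and the function names are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical sample data, not taken from PRFilter.
RELEASES = [
    "Acme, a leading provider of cloud solutions, today announced a cutting-edge platform.",
    "Globex unveils an innovative, award-winning solution for retailers.",
    "Initech named an innovator in real-time analytics.",
]

# Each label maps to the exact variants that are counted under it.
TERMS = {
    "solution": ["solution"],
    "leading provider": ["leading provider"],
    "cutting edge": ["cutting edge"],
    "innovate / innovative / innovator": ["innovate", "innovative", "innovator"],
}


def phrase_pattern(phrase):
    """Compile a case-insensitive, whole-word regex for an exact phrase.

    Spaces in the phrase also match hyphens, so "cutting edge"
    finds "cutting-edge" as well.
    """
    words = [re.escape(word) for word in phrase.split()]
    return re.compile(r"\b" + r"[\s-]+".join(words) + r"\b", re.IGNORECASE)


def count_matches(releases, terms):
    """Count how many releases contain at least one variant of each term."""
    counts = Counter()
    for label, variants in terms.items():
        patterns = [phrase_pattern(variant) for variant in variants]
        counts[label] = sum(
            1 for text in releases if any(p.search(text) for p in patterns)
        )
    return counts


if __name__ == "__main__":
    for label, matches in count_matches(RELEASES, TERMS).most_common():
        print(f"{label}: {matches}")
```

Because the patterns use word boundaries, “solutions” does not count as a match for “solution”, and a release is counted once per term no matter how often the term appears in it. Whether that is the right definition of a match is a judgment call.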

For more information on PRFilter see the launch press release and this video:

Comments

  1. says

    Hi Adam

    Thanks for testing out PRFilter and glad to hear that you generally found that it delivered. Just to pick up on some of your specific queries and observations.

    Pitch Engine – You are correct that Pitch Engine isn’t indexed currently. We are adding new sources all the time and we will try and accelerate their inclusion.

    6 missing terms – I am afraid I have some bad news regarding the terms that you couldn’t find. It isn’t because they don’t appear; in fact potentially it’s the opposite.

    As PRFilter is designed to find relevant press releases we saw no point in tokenising and indexing terms that appeared frequently in articles but don’t add anything in terms of relevance e.g. a, the, of etc. I’m afraid that the ones you refer to were in that list. In fact until a week ago so was leading – read into that what you will :-) – it was taken out because someone on Twitter expressed an interest in this very topic and particularly that term so I thought I would see what happened if we allowed it through.

    I have now taken out the other 6 terms which didn’t appear on your list. This will make the system have to work a little harder as they potentially appear quite a bit but give it a few days and we should be able to give you an answer on these too. Happy to just give you the results and save you doing all those searches if you like?

    Speed of indexing – glad you found the indexing to be very dynamic as it’s meant to be to make sure the news is fresh. We aim to have releases indexed within 10 minutes of publish time and average nearer 5 minutes in around 90% of cases.

    Phrases/quotation marks – using quotation marks *should* produce different results. For instance if you search for cutting edge without quotation marks then releases with instances of – the phrase, both terms or either term – could appear depending on how relevant the system thinks they are whereas quotation marks should only return matches where the terms appear in tandem. However the system does have a built in cut-off for search results such that it stops displaying results that only have a very weak relevance compared to the top results. For terms such as cutting edge this may well result in similar results through each method given the high number of releases that actually *do* include the exact phrase and so are considered significantly more relevant than ones in which just “cutting” appears for instance.

    Overlap – similar to the “6 terms” issue PRFilter is looking for relevance and as you point out most people aren’t looking for terms that generally don’t indicate relevance (as you are) and so it looks at the stem of such words rather than worrying about exact matches.

    The core purpose of PRFilter is to filter releases on a personal basis (as described in the video) so some of the tokenisation quirks above are because you take a slightly different approach to search than filtering. The public search functionality was added in response to requests to open up the system to a wider audience and so we have had to adapt things. We are already working on some upgrades that should marry the two requirements even better.

    Thanks again.

    Adam
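
The two indexing behaviours described in the comment above, skipping common words at index time and matching on word stems, can be illustrated with a toy inverted index. This is only a sketch under stated assumptions: PRFilter's actual stop-word list and stemmer are not public, so `STOP_WORDS`, the crude `stem` function and the sample `RELEASES` below are hypothetical stand-ins.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list. The comment mentions words such as "a", "the"
# and "of", plus some of the buzzwords; the real list is an assumption here.
STOP_WORDS = {"a", "the", "of", "great", "top", "largest"}

# Hypothetical sample data.
RELEASES = [
    "A great innovation from the top provider",
    "An innovative solution",
    "Named innovator of the year",
]


def stem(token):
    """Very crude suffix stripping, a stand-in for a real stemmer."""
    for suffix in ("ation", "ative", "ator", "ate", "ion", "ive", "or"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(releases):
    """Map each stem to the set of release ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(releases):
        for token in tokenize(text):
            if token in STOP_WORDS:
                continue  # never indexed, so it can never be found
            index[stem(token)].add(doc_id)
    return index


def search(index, term):
    """Look up a single term; stop words always return no matches."""
    token = term.lower()
    if token in STOP_WORDS:
        return set()
    return index.get(stem(token), set())


if __name__ == "__main__":
    index = build_index(RELEASES)
    for term in ("great", "innovative", "innovation", "innovator"):
        print(term, sorted(search(index, term)))
```

In this toy version a query for “great” finds nothing even though the word appears in the first release, and the three “innov-” queries all return the same set of releases, which mirrors the zero-match terms and the overlap seen in the post.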

  2. says

    Adam, thank you for providing all of this information.

    For the general terms with no matches like “great” and “largest” it actually makes a lot of sense to exclude those as you did. My queries to check for overused terms are certainly not typical and fewer, more relevant results is a better user experience. So I wouldn’t want to advocate for removing those exclusions. But if you do pull data on those particular terms I can update the post.

    In some cases I did see different match counts for multi-word phrases with and without quotation marks. But when I looked through the results for queries with quotation marks I didn’t always see the exact match phrase highlighted. For example in the results for “leading provider” there were some instances of just “leading” or “provider.” So it didn’t appear that queries with quotation marks were bringing up only exact matches. But I better understand now what you are doing in terms of trying to make the results more relevant and useful, so that makes sense.

    The same goes for the overlap in the results for similar terms, that seems like the right approach for users.

    In going back to read through your comment I noticed a PitchEngine release had already been added to your index, so you guys do move fast. :)

    Best of luck with the service; it’s a useful tool.

  3. says

    Thanks Guys,
    This is a cool idea. Surprised the wire services are allowing it, since most of them own the content users of their service send them (and make money by reselling it to third party sites). The Google News workaround is brilliant!

    Keep us posted!
    Best,
    Jason Kintzler, PitchEngine
