Top Sifter Filter Mistakes

1. Using “contains” instead of a word match

The contains operator matches partial text, including text embedded inside a longer word. It is sometimes confused with a standard word or text match, and it is frequently used where the hashtag operator (#) was intended.

 

Example 1

Original Filter

contains:informatics

Problem

It is unlikely that “informatics” will be tweeted as part of a larger single word. The user really intends to match on the word by itself.

Correct Filter

informatics

 

Example 2

Original Filter

contains:#MMIW

Problem

The user intends to match on the hashtag, not on words that have #MMIW embedded within them.

Correct Filter

#MMIW
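
The difference between a partial match and a word match can be illustrated with a short Python sketch. This is not how Sifter or Twitter implements matching; the function names, the case-insensitive comparison, and the sample tweets are assumptions made for illustration only.

```python
import re

def contains_match(term: str, text: str) -> bool:
    """Partial match: true if the term appears anywhere, even inside a longer word."""
    return term.lower() in text.lower()

def word_match(term: str, text: str) -> bool:
    """Word match: the term must not be directly flanked by letters, digits, or underscores."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# Hypothetical tweets, used only to show the two behaviors.
print(contains_match("gluten", "I got glutened again"))   # True: "gluten" is inside "glutened"
print(word_match("gluten", "I got glutened again"))       # False: not a word on its own
print(word_match("#MMIW", "Justice for #MMIW now"))       # True: the hashtag stands alone
```

If the goal is the word or hashtag by itself, the plain term behaves like word_match above, which is why contains: is unnecessary in both examples.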

 

2. Only Negative Expressions

Twitter needs at least one positive term to search by. Use negative expressions only to limit the results that are matched by the positive search terms.

 

Example 1

Original Filter

-(-contains:glutened)

Problem

Even though the two negatives cancel out, this expression does not contain a single positive element.

Correct Filter

contains:glutened

 

Example 2

Original Filter

-Shahbag -#Shahbag

Problem

Twitter will match every single tweet except the relative few that contain the word Shahbag, but not the hashtag #Shahbag, and vice versa.

In fact, the following tweet would be missed because the two terms are not OR’d with each other:

evrybdy missd in shahbag. 2 mny ppl dead already #jamaat. #Rajib #shahbag

Correct Filter

Shahbag OR #Shahbag
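
A quick pre-check for this mistake can be sketched in Python. The helper below is hypothetical and deliberately simple: it assumes terms are separated by whitespace, that a leading "-" marks a negative term, and that OR is the only connector. It does not understand quoted phrases or nested parentheses.

```python
def has_positive_term(filter_text: str) -> bool:
    """Return True if at least one whitespace-separated term is not negated."""
    for token in filter_text.split():
        if token.upper() == "OR":
            continue  # connector, not a search term
        if not token.startswith("-"):
            return True  # found a positive term
    return False

print(has_positive_term("-Shahbag -#Shahbag"))     # False
print(has_positive_term("-(-contains:glutened)"))  # False
print(has_positive_term("Shahbag OR #Shahbag"))    # True
```

A filter for which this check returns False has nothing positive for Twitter to search by and should be rewritten before it is submitted.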

 

3. Too Many Expressions

Twitter will not accept searches with more than 29 positive or 49 negative clauses.

 

Example

Original Filter

#PlayingChicken OR #FSW2014 OR #FSW OR #Campylobacter OR #campy OR #Bacteria OR #FoodPoisoning OR #SpreadsInfection OR #WashingChicken OR #foodsafetyweek OR #foodsafety OR #rawmeat OR #rawchicken OR "wash raw chicken" OR "wash chicken" OR "wash raw poultry" OR "wash poultry" OR "washing raw chicken" OR "washing chicken" OR "washing raw poultry" OR "washing poultry" OR "food standards agency" OR "food standards" OR "spread the word not the germs" OR "campylobacter poisoning" OR "what's going on in your kitchen" OR "food safety week" OR "food safety" OR "Food Poisoning" OR "FSA" OR @foodgov OR contains:chicken food borne illness OR contains:chicken germs OR contains:food bugs OR contains:chicken bacteria OR url_contains:"https://www.youtube.com/watch?v=KsX1GWA3eFw" OR url_contains:"https://www.youtube.com/watch?v=0svUVR_fATM" OR activity_url_contains:"food.gov.uk"

Problem

Among other problems, this query contains too many positive clauses. Twitter has set a limit on searches like this and will reject it.

Correct Filter

The query must be separated into two or more smaller queries. If necessary, the Twitter data from multiple queries can be analyzed together as one data set within DiscoverText.
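
One way to split a long OR query is sketched below in Python. The function name and the sample hashtags are hypothetical, and the default limit of 29 comes from the positive-clause limit stated above. The sketch assumes every clause is joined by " OR " and that no quoted phrase itself contains " OR ".

```python
from typing import List

def split_or_query(query: str, max_clauses: int = 29) -> List[str]:
    """Split an OR-joined query into smaller queries of at most max_clauses clauses each."""
    clauses = [c.strip() for c in query.split(" OR ") if c.strip()]
    return [
        " OR ".join(clauses[i:i + max_clauses])
        for i in range(0, len(clauses), max_clauses)
    ]

# Hypothetical query with 38 clauses, more than the limit allows.
long_query = " OR ".join(f"#tag{i}" for i in range(38))
parts = split_or_query(long_query)
print(len(parts))                       # 2
print(len(parts[0].split(" OR ")))      # 29
print(len(parts[1].split(" OR ")))      # 9
```

Each resulting query can be run separately, and the collected data can then be combined into one data set as described above.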
