Unfortunately, the Sifter service and website were decommissioned as of September 30, 2018. Thanks to our 6,970 users, who created 16,128 free estimates from the complete, undeleted history of Twitter between 1/14/2014 and 9/29/2018.
Please contact Twitter for approval of future academic or commercial use cases. If you can get an approved use case from Twitter, we can still help you work with the data inside DiscoverText.
@DiscoverText remains open and is still the top-ranked text analysis platform on the Internet.
All paid jobs prior to the decommissioning will still be honored.
1. Using “contains” instead of a word match
The contains operator looks for partial matches in text, so it also matches when the search term is embedded in a longer word. It is sometimes confused with a standard word or text match, and it is frequently used where the hashtag operator (#) was intended.
It is unlikely that “informatics” will be tweeted as part of a larger single word. The user really intends to match on the word by itself.
The user intends to match on the hashtag, not on words that have #MMIW embedded within them.
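The difference between a partial match and a word or hashtag match can be sketched in a few lines of Python. This is only an illustration of the idea, not how Twitter or Sifter evaluate queries. The function names are hypothetical, and the sketch assumes that a term directly preceded by # or @ counts as a hashtag or mention rather than a plain word.

```python
import re

def contains_match(term: str, text: str) -> bool:
    """Partial match: the term may be embedded in a longer word."""
    return term.lower() in text.lower()

def word_match(term: str, text: str) -> bool:
    """Whole-token match: the term must not be glued to other word characters.

    Assumption: a term preceded by '#' or '@' is treated as a hashtag or
    mention, so it does not count as a plain word match.
    """
    pattern = r"(?<![\w#@])" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# A partial match fires inside a longer word; a word match does not.
print(contains_match("informatics", "I study bioinformatics"))  # True
print(word_match("informatics", "I study bioinformatics"))      # False

# The same applies to a hashtag embedded in a longer hashtag.
print(contains_match("#MMIW", "#MMIWG2S march"))  # True
print(word_match("#MMIW", "#MMIWG2S march"))      # False
print(word_match("#MMIW", "Remember #MMIW today"))  # True
```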
2. Only Negative Expressions
Twitter needs at least one positive term to search by. Use negative expressions only to narrow the results that are matched by the positive search terms.
Even though the double-negatives cancel out, this expression has no single positive element.
Twitter will match every single tweet except the relative few that contain the word Shahbag but not the hashtag #Shahbag, and vice versa.
In fact, the following tweet would be missed because the two terms are not OR’d with each other:
evrybdy missd in shahbag. 2 mny ppl dead already #jamaat. #Rajib #shahbag
Shahbag OR #Shahbag
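A query can be checked for at least one positive element before it is submitted. The sketch below is a rough pre-flight check, not Twitter's parser. It assumes that negation is written with a leading "-" and that groups are not nested; the function names and the sample negative-only queries are hypothetical.

```python
import re

# A top-level clause is a parenthesised group, a quoted phrase, or a bare
# term, each optionally prefixed with "-" (assumed negation marker).
# Nested parentheses are not handled in this sketch.
CLAUSE = re.compile(r'-?\([^()]*\)|-?"[^"]*"|\S+')

def top_level_clauses(query: str) -> list:
    """Split a query into top-level clauses, dropping the OR keyword."""
    return [c for c in CLAUSE.findall(query) if c.upper() != "OR"]

def has_positive_clause(query: str) -> bool:
    """True if at least one top-level clause is not negated."""
    return any(not c.startswith("-") for c in top_level_clauses(query))

print(has_positive_clause("Shahbag OR #Shahbag"))    # True
print(has_positive_clause("-Shahbag -#Shahbag"))     # False
print(has_positive_clause("-(-Shahbag -#Shahbag)"))  # False
```

The last call shows why a double negative does not help in this model: the whole group is still a negated clause at the top level, so the query has nothing positive to search by.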
3. Too Many Expressions
Twitter will not accept searches with more than 29 positive or 49 negative clauses.
#PlayingChicken OR #FSW2014 OR #FSW OR #Campylobacter OR #campy OR #Bacteria OR #FoodPoisoning OR #SpreadsInfection OR #WashingChicken OR #foodsafetyweek OR #foodsafety OR #rawmeat OR #rawchicken OR "wash raw chicken" OR "wash chicken" OR "wash raw poultry" OR "wash poultry" OR "washing raw chicken" OR "washing chicken" OR "washing raw poultry" OR "washing poultry" OR "food standards agency" OR "food standards" OR "spread the word not the germs" OR "campylobacter poisoning" OR "what's going on in your kitchen" OR "food safety week" OR "food safety" OR "Food Poisoning" OR "FSA" OR @foodgov OR contains:chicken food borne illness OR contains:chicken germs OR contains:food bugs OR contains:chicken bacteria OR url_contains:"https://www.youtube.com/watch?v=KsX1GWA3eFw" OR url_contains:"https://www.youtube.com/watch?v=0svUVR_fATM" OR activity_url_contains:"food.gov.uk"
Among other problems, this query contains too many positive clauses. Twitter has set a limit on searches like this and will reject it.
The query must be separated into two or more smaller queries. If necessary, the Twitter data from multiple queries can be analyzed together as one dataset within DiscoverText.
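Splitting a long OR list can be done mechanically. The sketch below is a minimal illustration that uses the limit of 29 positive clauses stated above. It assumes the query is a flat list of clauses joined by " OR " and that " OR " never appears inside a quoted phrase. The function name and the placeholder hashtags are hypothetical.

```python
def split_or_query(query: str, max_clauses: int = 29) -> list:
    """Split a flat OR query into smaller OR queries.

    Each returned query has at most max_clauses clauses. Assumes that
    " OR " only ever separates clauses and never occurs inside quotes.
    """
    clauses = [c.strip() for c in query.split(" OR ") if c.strip()]
    return [
        " OR ".join(clauses[i:i + max_clauses])
        for i in range(0, len(clauses), max_clauses)
    ]

# Placeholder query with 38 OR'd clauses, split into chunks of 29 and 9.
long_query = " OR ".join(f"#tag{i}" for i in range(38))
for part in split_or_query(long_query):
    print(part.count(" OR ") + 1, "clauses")
```

Because the clauses are OR'd, the smaller queries together cover the same tweets as the original. A tweet that matches clauses in more than one of the smaller queries will appear in each result set, so some de-duplication may be needed when the data is combined.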