The Twitter (Gnip PowerTrack) sample operator can be very useful if you receive an estimate for a very large number of tweets but do not need the entire population of tweets for your research. The sample operator returns a random subset of the tweets that match a rule rather than the entire set, which can substantially lower the cost. The sample percent must be an integer value between 1 and 100. This operator applies to the entire rule and requires that any OR’d terms be grouped in parentheses.
The following example returns a 5% sample of the matching tweets:
(cellphone OR smartphone) sample:5
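If you assemble rules in a script, the constraints above (an integer percent between 1 and 100, and grouped OR'd terms) can be checked before a rule is submitted. The following is only a minimal sketch: the function name `build_sampled_rule` and its parameters are hypothetical and are not part of any Twitter or Gnip library.

```python
def build_sampled_rule(terms, percent):
    """Return a rule string such as '(cellphone OR smartphone) sample:5'.

    Hypothetical helper for illustration; it only builds and checks a string.
    """
    # The sample percent must be an integer between 1 and 100.
    if isinstance(percent, bool) or not isinstance(percent, int):
        raise ValueError("sample percent must be an integer")
    if not 1 <= percent <= 100:
        raise ValueError("sample percent must be between 1 and 100")
    if not terms:
        raise ValueError("at least one term is required")

    joined = " OR ".join(terms)
    # Group OR'd terms so the sample operator applies to the whole rule.
    if len(terms) > 1:
        joined = f"({joined})"
    return f"{joined} sample:{percent}"


print(build_sampled_rule(["cellphone", "smartphone"], 5))
# prints: (cellphone OR smartphone) sample:5
```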
If you require a minimum number of tweets (e.g., 1,000) for each day of data we retrieve, you first need to confirm that the Twitter community creates at least that many matching tweets per day. One way to estimate the average number of tweets per day is to run several 1-day samples. Then set your sample percentage so that the average daily volume multiplied by the sample percentage still meets your minimum. For example, if a rule matches about 20,000 tweets per day on average, a 5% sample would return roughly 1,000 tweets per day.
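The arithmetic for choosing a sample percentage can be sketched as follows. This is an illustration under stated assumptions, not an official procedure: the function name `required_sample_percent` and its parameters are hypothetical, and it assumes each 1-day probe was itself run with a known sample percent (`probe_percent`; pass 100 if the counts are full daily volumes).

```python
import math


def required_sample_percent(one_day_counts, probe_percent, min_per_day):
    """Estimate the smallest integer sample percent that meets min_per_day on average.

    Hypothetical helper. Returns None when the estimated average daily
    volume is below the minimum, since no sample percent can help then.
    """
    if not one_day_counts:
        raise ValueError("at least one 1-day count is required")
    # Scale each probe count up to an estimate of the full daily volume.
    estimated_daily = [count * 100 / probe_percent for count in one_day_counts]
    average_daily = sum(estimated_daily) / len(estimated_daily)
    if average_daily < min_per_day:
        return None
    # Round up to an integer percent and keep it within the allowed 1-100 range.
    percent = math.ceil(min_per_day * 100 / average_daily)
    return min(100, max(1, percent))


# Three 1-day probes at 5% returned 1,200, 900, and 1,500 tweets,
# which suggests an average of about 24,000 matching tweets per day.
print(required_sample_percent([1200, 900, 1500], probe_percent=5, min_per_day=1000))
# prints: 5
```

Because the sample is random and daily volume varies, a percentage based on the average will likely fall short on quieter days. Basing the calculation on the lowest observed day, or adding a margin, is a more conservative choice.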
Please see Twitter's premium operators documentation for more information.