Principal Project Parts
Every project in DiscoverText has three fundamental components:
- Archives: collections of raw data.
- Buckets: subsets of raw data.
- Datasets: data humans can code and machines can classify.
Raw data archives can come from a variety of sources, including uploaded spreadsheets, large email collections, and live social media feeds.
Buckets are produced using a variety of tools, including search, filters, coding, de-duplication, clustering, and machine classification. Buckets are refined archives.
Anything that appears in a list view can go in a bucket, as can any duplicate group or set of groups.
Datasets can be coded (labeled or tagged) by one or more DiscoverText users. They can also be machine classified using our "sifter" technology.
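To make the relationship between the three parts concrete, here is a minimal conceptual sketch in Python. It is not the DiscoverText API: the class names, the `search_to_bucket` and `bucket_to_dataset` helpers, and the idea that a dataset is built from a bucket are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Archive:
    """Hypothetical model: a collection of raw items (e.g. spreadsheet rows)."""
    name: str
    items: List[str]


@dataclass
class Bucket:
    """Hypothetical model: a subset of an archive's items."""
    name: str
    items: List[str]


@dataclass
class Dataset:
    """Hypothetical model: items that people can code (label or tag)."""
    name: str
    items: List[str]
    codes: Dict[str, List[str]] = field(default_factory=dict)

    def code(self, item: str, label: str) -> None:
        # Only items that belong to this dataset can be coded.
        if item not in self.items:
            raise ValueError(f"{item!r} is not in dataset {self.name!r}")
        self.codes.setdefault(item, []).append(label)


def search_to_bucket(archive: Archive, term: str, bucket_name: str) -> Bucket:
    """Refine an archive into a bucket with a case-insensitive keyword search."""
    matches = [item for item in archive.items if term.lower() in item.lower()]
    return Bucket(bucket_name, matches)


def bucket_to_dataset(bucket: Bucket, dataset_name: str) -> Dataset:
    """Assumed step: copy a bucket's items into a dataset ready for coding."""
    return Dataset(dataset_name, list(bucket.items))
```

In this sketch, a search over an archive yields a bucket, and the bucket's items become a dataset that a user codes one item at a time. Keyword search stands in for the other bucket-building tools (filters, de-duplication, clustering, machine classification), which are not modeled here.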