What is list segmentation?
At the basic level, it's exactly what it sounds like: breaking your customer or prospect list into smaller segments. Segments can be defined in different ways: by age, by gender, etc. For example, you can target women and men, or Millennials and Baby Boomers, separately and deliver more relevant offers to each group (segment). Segmentation is a critical step in successful marketing campaigns: 90.7% of US marketers use customer segmentation in their campaigns according to GDMA, and segmented email campaigns produce a 100.95% higher click rate according to MailChimp.
What is a data append?
List or data appending means submitting your customer or prospect list so that missing data such as age, gender or other demographics can be added to it. In the end, you receive the appended list with the additional data, so you can know your customers better, which is critical in marketing and data-driven decision making.
Are you a data broker?
No. We don't deal with PII (personally identifiable information) in general or with consumer databases in particular. Your customers' and prospects' sensitive information is safe by design because we never ask for it. Instead, Demografy uses patent-pending, machine-learning-based technology that takes only names and estimates demographic data from them. That is why, unlike traditional solutions, we are able to provide you with 100% coverage (the share of records in your list that can be processed).

Data brokers and data append services, by contrast, require as much sensitive data about your contacts as possible in order to match individual records against consumer databases. This jeopardises your contacts' privacy and results in low coverage, since not all records can be matched. It also reduces the number of contacts you can submit, because some businesses simply don't have information like addresses or phone numbers in their lists, especially in prospect lists. Besides this, data append services don't provide 100% accurate results even for matched records.
Shouldn’t data brokers be 100% accurate?
No, they are not 100% accurate. The general consensus among marketers and data scientists is that such services don't provide 100% accuracy even for matched records. Honest data append services never guarantee 100% accuracy and don't quantify their accuracy, because they simply don't know how accurate they are. Some industry leaders in data append have 70% accuracy according to estimates. Although these services are intended to directly match a record in your list with a record in their database(s), there are many challenges that can result in an inaccurate match.

For example, the most accurate match input parameter for data append is the postal address. However, this attribute is not perfect: only 66.9% of mail is deliverable as addressed according to NCOA, and 10.1% of Americans move every year according to the US Census. As a result, outdated records on either the client's or the data provider's side may lead to an inaccurate match. Moreover, both the data you provide and the data in the third-party databases used by data append services may simply be erroneous. Another challenge with third-party databases is that the original data source is unknown: there is no way to tell whether purchased data is estimated, self-reported or obtained some other way.
Is Demografy accurate?
For analytics (overall demographic statistics of your list), we have 90-95% accuracy. For data append and list segmentation (demographics detection for each record in your list), accuracy depends on the demographic indicator and the selected coverage (how many records in your list are processed). You may choose the optimal trade-off between accuracy and coverage, since lower coverage generally produces higher accuracy. The average accuracies for segmentation and append are as follows: gender - 96% accuracy (at 100% coverage); race - 89% accuracy (at 100% coverage); Hispanic origin - 95% accuracy (at 100% coverage); age range - 70% accuracy (at 80% coverage), 75% accuracy (at 50% coverage) or 80% accuracy (at 30% coverage); ethnicity - 82% accuracy (at 100% coverage) or 93% accuracy (at 52% coverage). You will see an individual accuracy estimate for your particular list before purchase.

Please note that we do our best to provide you with the best accuracy we can. Accuracy also depends heavily on your particular list and may differ considerably from these averages. For example, some test lists show 86% accuracy at 100% coverage for age range detection, while others may perform worse than average.
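The trade-off between accuracy and coverage can be pictured as keeping only the records a model is most sure about. The Python sketch below is an illustration only, not Demografy's implementation: it assumes each prediction comes with a confidence score, and the function name `select_by_coverage` and the `(record_id, label, confidence)` layout are hypothetical.

```python
def select_by_coverage(predictions, coverage):
    """Keep the most confident share of predictions.

    predictions: list of (record_id, label, confidence) tuples.
                 The confidence score is an assumption made for this sketch.
    coverage:    fraction of records to keep, e.g. 0.8 for 80% coverage.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in the range (0, 1]")
    # Most confident predictions first.
    ranked = sorted(predictions, key=lambda p: p[2], reverse=True)
    # Number of records that corresponds to the requested coverage.
    keep = round(len(ranked) * coverage)
    return ranked[:keep]


if __name__ == "__main__":
    # Ten made-up predictions with confidences 0.0, 0.1, ..., 0.9.
    sample = [(i, "18-39", i / 10) for i in range(10)]
    for record in select_by_coverage(sample, 0.3):
        print(record)
```

Lowering `coverage` drops the least confident records first, which is why a smaller processed share generally goes together with higher accuracy.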
Do you provide an accuracy estimate before purchase?
Yes. We are the only solution that provides you with an accuracy prediction for your particular list. We try to be as transparent as possible and give you as much input for decision making as we can. Every list is unique and can perform with different accuracy. We know how lists differ, so we provide an approximate estimate of your predicted accuracy, enabling you to decide whether or not to purchase the processed data. The estimate is based on the known accuracies of lists similar to yours. However, please note that the prediction is not 100% accurate, and the actual performance of your list may be better or worse than predicted.
What data should I provide?
Only names: the first and last name of each contact. The last name can additionally be masked with a wildcard to completely hide the person's identity, e.g. John J*son (the wildcard mask removes all letters except the first one and the last three, or fewer depending on the name length). We are going to add indicators in the future that may require extra data (still non-PII), so you can optionally provide additional data in order to extract more from your lists later. Additional data may also provide better accuracy in the future. However, the current indicators require only first and last names. The data is provided as a spreadsheet file in either XLSX or CSV format. You will find more information and examples of input data in your dashboard.
Is a full name non-PII?
Yes, it's not PII (personally identifiable information) if the last name is masked. If it's not masked, it depends. Generally, a full name is not PII. PII is information that allows a person to be identified. In almost all cases, first and last names are not enough to identify a person, and extra data such as an address or birthdate is required. But depending on the applicable personal data regulation, a full name can be considered PII. So if you are subject to stricter personal data regulation, or if you or your customers are still concerned, we provide an extra option: you can upload a list containing only first names and masked last names. A masked name looks like this: John J*son, which could be John Johnson, John Jameson, John Jackson or anybody else, so nobody will ever be able to identify any person in your list. To mask a last name, omit any letters in the middle but leave the first letter and the last three letters intact. All omitted letters are replaced with a single wildcard character (*). If the last name is too short, you can leave only the last two letters. If the name is only two letters long, you can keep both letters and prepend the wildcard so that it looks like a different name with more letters. Please note: the more your last names are masked, the less accurate your results may be for some demographic indicators.
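If you want to mask last names before uploading, the rules above can be sketched in a few lines of Python. This is a minimal sketch, not an official tool: the function name `mask_last_name` is hypothetical, "too short" is assumed to mean four letters, and the handling of one- and three-letter names (which the rules above don't cover) is an assumption that simply reuses the two-letter rule.

```python
def mask_last_name(last_name: str) -> str:
    """Mask a last name with a single wildcard (*).

    Rules taken from the FAQ:
      - keep the first letter and the last three letters;
      - if the name is too short for that, keep only the last two letters;
      - if the name is two letters long, keep both and prepend the wildcard.
    Assumptions of this sketch: "too short" means four letters, and names of
    one or three letters are handled like two-letter names.
    """
    name = last_name.strip()
    if len(name) >= 5:
        # First letter + wildcard + last three letters, e.g. Johnson -> J*son.
        return name[0] + "*" + name[-3:]
    if len(name) == 4:
        # Keeping three trailing letters would hide nothing, so keep two.
        return name[0] + "*" + name[-2:]
    # Very short names: keep the letters and prepend the wildcard.
    return "*" + name


if __name__ == "__main__":
    for surname in ["Johnson", "Jameson", "Jackson", "Show", "Li"]:
        print(surname, "->", mask_last_name(surname))
```

Johnson, Jameson and Jackson all come out as J*son, which is exactly why a masked name can no longer be traced back to one person.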
Any best practices for the list being uploaded?
Yes. While most demographic indicators work well with any list, age range is the most challenging indicator to extract from names and it may be impacted by the bias in the list you’re uploading. In order to get the most accurate results for age range segmentation/append you should follow some best practices while preparing your list.

In general, the rule of thumb is to upload all your customers/prospects or a random subset of them to avoid unintentional age bias. For example, you should not upload only names starting with a specific letter or only a particular group of your customers. If you can't upload the whole list, you can ask your IT or data person to pull a random subset of records from your CRM or database, or contact us directly for assistance.
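For whoever prepares the list, drawing a random subset can be as simple as the Python sketch below. It assumes your contacts are already loaded into a list; the function name `random_subset` and the sample data are hypothetical.

```python
import random


def random_subset(records, size, seed=None):
    """Return `size` records chosen uniformly at random, without repeats.

    Every record has the same chance of being picked, which avoids the
    unintentional bias of taking, say, only the first rows of an export.
    """
    records = list(records)
    if size >= len(records):
        # Nothing to sample: the whole list is requested.
        return records
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(records, size)


if __name__ == "__main__":
    contacts = [f"Contact {i}" for i in range(1, 1001)]  # made-up data
    for contact in random_subset(contacts, 5, seed=1):
        print(contact)
```

Sampling from the whole export, rather than from a filtered view of it, is what keeps the subset representative.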
Do you provide discounts for large lists?
Yes. The bigger your list, the lower the price. You can see how the price varies with list size using the pricing calculator on the PRICING page, or calculate the price for your particular list in the dashboard.
What data do you detect?
We currently provide analytics and segmentation/append for Gender (Male, Female), Age Range (18-39, 40-59, 60+), Race (White Americans, African Americans, Asian Americans, Native Americans, Two or more races), Hispanic origin (Hispanic, non-Hispanic) and Ethnicity (British, Germanic, Hispanic, African, Italian, East European, French, Nordic, Indian, East Asian, Jewish, Japanese, Arab).
Do you plan to detect more demographic data?
Yes. We are working on additional indicators and plan to introduce them gradually in the mid-term. However, we can't announce a specific list of indicators or an ETA right now.
Will there be integrations with popular CRMs?
Yes. We are currently collecting feedback to decide which integrations should be implemented first. Integrations will enable you to demografy your audience without leaving the software you are used to working with. You can take part in a survey in your dashboard to add your CRM or other software to the list. The most popular ones will be implemented first.
Do you work with non-US lists?
Yes. However, the technology currently works best with US lists. Race and, to some extent, age may not work properly with non-US lists, while the other indicators (Gender, Hispanic origin, Ethnicity) should work well with all lists.
Why are exported data and analytics not consistent?
You may have, for example, 15 people out of 100 detected as Hispanic but see 16.5% Hispanics in analytics. The reason is that the overall analytics calculation takes additional variables into account, such as the numbers of false positive and true positive classifications, to compensate for classification error in the calculated segments and to represent the aggregated data more accurately. That is why aggregated analytics differs from the counts obtained by classifying individual records, and is generally more accurate. The biggest discrepancy you may find is in age data. Predicting age from something as scarce as a name is the most challenging task: it has an average accuracy of around 70% for individual records, while overall aggregated analytics has an average accuracy of 90-95%, so the age bracket composition differs between the two algorithms. Analytics gives you a truer overall demographic profile of your audience, while individual record prediction gives you demographic data for each record.
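To see why an aggregate figure can legitimately differ from a simple count, here is a small Python sketch of one textbook way to correct a counted share for classification error. It is not Demografy's actual algorithm, which is not published here: the function name `corrected_share` and the true positive and false positive rates used in the example are hypothetical values chosen only for illustration.

```python
def corrected_share(observed_share, true_positive_rate, false_positive_rate):
    """Estimate the real share of a segment from the share a classifier reports.

    The reported share mixes correctly detected members with false alarms:
        observed = real * TPR + (1 - real) * FPR
    Solving this for `real` gives the formula used below.
    """
    if true_positive_rate <= false_positive_rate:
        raise ValueError("the classifier must detect better than chance")
    real = (observed_share - false_positive_rate) / (
        true_positive_rate - false_positive_rate
    )
    # A share must stay between 0 and 1.
    return min(1.0, max(0.0, real))


if __name__ == "__main__":
    # 15 of 100 records flagged; the two rates are made-up illustration values.
    share = corrected_share(0.15, true_positive_rate=0.88, false_positive_rate=0.005)
    print(f"{share:.1%}")
```

With these illustrative rates the corrected share comes out at roughly 16.6%, higher than the 15% counted record by record, because some true members of the segment were missed by the per-record classification.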
Why is data different for some similar demographics?
You may see slightly different data for Hispanic Americans in Hispanic origin and Hispanics in ethnicity. The Hispanic Americans segment in Hispanic origin is similar to the Hispanic segment in ethnicity, but the algorithms are slightly different, which leads to differences in the overall aggregated statistics. When demographics are detected for each individual record, however, the Hispanic Americans segment in Hispanic origin is generally correlated with the Hispanic segment in ethnicity.

You may also see different data for African Americans (race) and Africans (ethnicity). The African American segment in race is not the same as the African segment in ethnicity. African in ethnicity comprises people of African ancestry with specifically African names, while African Americans generally have names similar to those of White Americans and represent a unique North American cultural formation with a history of many generations living in North America.