The statistics you are looking for may not have been collected, may not be in a format you can use, or may not be easy to locate.
Most organizations and agencies collect statistics solely to support their own objectives, which do not always coincide with the needs of the searcher. The statistics you are searching for may therefore not be collected by any organization. At other times, the data you are looking for has not been analyzed in the manner you need, and you may have to obtain the original data and perform your own analysis.*** Remember that data collection is done to meet organizational needs, which may not reflect your needs!
Many entities publish statistics online. However, the particular number, answer, or data you need may not be retrievable through search engines. Data contained in Web sites is often hidden because search engines do not search the entire contents of Web pages.
Statistics collected by government agencies and nonprofit organizations are more likely to be publicly available than those collected by for-profit entities. However, not all government statistics are mandated to be made public or provided free of charge. Private organizations often charge for copies of their findings. **
At present, subject access through most indexing sources (such as PubMed and CINAHL) is still developing, so many statistics are hard to locate because of inadequate subject headings. This is especially true of federal government resources. When possible, check whether the reporting agency has the statistics you need. If not, you will need to search the journal literature in a bibliographic database such as PubMed, or in a resource such as Cancer.gov. Be sure to read each retrieved article in full for any statistical information it contains. Some topics, such as disability statistics, are challenging to find partly because the terms (such as "disability") are open to a wide range of definitions. Again, the lack of good controlled vocabularies hinders data location.
Also note that the data collection coverage of a source does not always go back as far as one would wish. For example, US government agencies have only been mandated to collect certain statistics since 1956***. In addition, agencies and organizations are decentralized. They may vary in how they collect, describe, and report their findings over time. ***
It may take several years for an entity to collect, analyze, and publish statistical information on a topic or group of topics. This is especially true when large populations are involved, as with the US Census. The quality of statistics also varies among organizations and agencies. Factors include how the data was collected and how it was analyzed. **
*Introduction to Reference Sources in the Health Sciences, 5th ed. Compiled and edited by Jeffrey T. Huber, Jo Anne Boorkman, and Jean Blackwell. Neal-Schuman: New York. 2008.
*Finding Medical/Health Care Resources Online. "Finding medical / health care statistics online." HLWiki International. Last modified 26 February 2016, at 14:34.
Please do not hesitate to contact a Mulford or Carlson Reference Librarian with any challenging research question (whether or not it is statistics-related).
Here is a strategy that may be useful, including at times when a librarian is temporarily unavailable (such as nights and weekends).
Introduction to Reference Sources in the Health Sciences, 5th ed. Compiled and edited by Jeffrey T. Huber, Jo Anne Boorkman, and Jean Blackwell. Neal-Schuman: New York. 2008.
Finding and Using Health Statistics, US National Library of Medicine, http://www.nlm.nih.gov/nichsr/usestats/index.htm
(Accessed 1 December 2009)
Blog item by Micah Altman (2016) at https://drmaltman.wordpress.com/2016/03/18/why-search-is-not-a-solved-by-google-problem-and-why-universities-should-care-ophir-frieders-talk/
Includes a 74-slide set.
Many consider “searching” a solved problem, and for digital text processing, this belief is factually based. The problem is that many “real world” search applications involve “complex documents”, and such applications are far from solved. Complex documents, or less formally, “real world documents”, comprise a mixture of images, text, signatures, tables, etc., and are often available only in scanned hardcopy formats. Some of these documents are corrupted. Some, particularly those of a historical nature, contain multiple languages. Accurate search systems for such document collections are currently unavailable.
The talk discussed three projects. The first project involved developing methods to search collections of complex digitized documents which varied in format, length, genre, and digitization quality; contained diverse fonts, graphical elements, and handwritten annotations; and were subject to errors due to document deterioration and from the digitization process. A second project involved developing methods to enable searchers who arrive with sparse, fragmentary, error-ridden clues about places and people to successfully find relevant connected information in the Archives Section of the United States Holocaust Memorial Museum. A third project involved monitoring Twitter for public health events without relying on a prespecified hypothesis.
Across these projects, Frieder raised a number of themes:
Some areas of science, such as the social sciences, increasingly rely on proprietary collections of big data from commercial sources. Much of this growing evidence base is currently accessible only through proprietary APIs. To meet the heightened requirements for transparency and reproducibility, stewards are needed for these data who can ensure nondiscriminatory long-term research access.
More generally, it is increasingly well recognized that the evidence base of science includes not only published articles and community datasets (and benchmarks), but may also extend to scientific software, replication data, workflows, and even electronic lab notebooks.