The
U.S. Constitution, in Article I, Section 2, mandates an official counting
of the population every 10 years. The first Census was completed in 1790.
While the official purpose of the Constitution's mandate of a population
count is to reapportion congressional districts, the Census also provides
a statistical history of the nation and its people and is an economic asset
and tool of inestimable value. The Census shows not only where people live
at the time of the count, but also their educational levels, income, and
other vital data.
The U.S. Census
Bureau has historically led in the use of information technology to collect,
process, analyze, manipulate, and publish data. It is a prestigious bellwether
of where public sector information is going. It has developed methods to
manage large data sets and large data collection projects. Counting the
population has been and continues to be a formidable task. The first Census
in 1790 took 18 months to complete1.
The Congress assigned responsibility for taking the Census to U.S. marshals
from 1790 to 1840. From 1850 to 1900, the Department of the Interior was
responsible for the enumeration2.
In 1902, the Census Bureau was established as part of the Department of
the Interior. In 1903, it was transferred to the new Department of Commerce and Labor.
The 1790 Census
revealed a population of 3.9 million people living in an area of 891,364
square miles. By 2000 the area of the U.S. had expanded to 3.6 million
square miles and the population had grown to 281 million people. The number
of questions on the Census has grown over the years to include more and
more socio-economic data. The 2000 Census used short forms for most people
and longer forms with detailed demographics for a sample of the population.
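The growth between the two counts can also be expressed as population density. The short Python sketch below is illustrative only: it uses the rounded figures quoted in this article, and the helper name `density` is a hypothetical one introduced here.

```python
def density(population, area_sq_miles):
    """Return people per square mile."""
    return population / area_sq_miles

# Rounded figures quoted in the article (assumed inputs, not official totals).
density_1790 = density(3_900_000, 891_364)
density_2000 = density(281_000_000, 3_600_000)
growth_factor = density_2000 / density_1790

print(f"1790: {density_1790:.1f} people per square mile")
print(f"2000: {density_2000:.1f} people per square mile")
print(f"Density grew roughly {growth_factor:.0f}-fold")
```

With these rounded inputs, density rises from about 4.4 to about 78 people per square mile.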
In addition to
contributions in the development of information technology, the Census
Bureau's statisticians have formulated better population sampling techniques.
However, Congress has mandated that the population count for reapportioning
the House of Representatives must be done person by person, with no sampling.
All other data collection relies on sampling the population.
Asset Value
Collecting, processing,
analyzing, and distributing Census data cost millions of dollars. The return
on this investment in terms of value to users and the economy is many times
the cost. Wide availability of the data permits forecasting of population-related
developments with greater accuracy. It supports decision making and planning
based on facts instead of guesses and estimates. Industry, business, government,
education, and other organizations use the Census of Population for many
purposes, ranging from forecasting the needs for classrooms for our schools
to locating fast food outlets. Other Census data products include data
about foreign trade, manufacturing, transportation, and many other areas.
The data products from the Census Bureau are especially valuable in making
business decisions and planning industrial and urban development.
Data collected
by the Census Bureau answer thousands of questions and provide the basis
for planning growth and development. How many high school teachers will
we need in 2010? How many manufacturers operate in Gwinnett County, Georgia,
and what products do they make? What is the average number of years of school
completed in different geographic areas? Among different groups in the population?
Where are the highest income communities? Where are the poorest communities?
Value-Added Publishers
Many third-party
publishers distribute Census data into the marketplace. Most businesses
and individuals need only subsets of any of the Census data compilations.
Subsets tailored to the needs of specific groups or businesses are easier
to use and satisfy the needs of users more quickly than having users try
to find the data themselves. By selecting appropriate subsets and analyzing
and mapping data, third-party publishers add value for their customers.
Having the right set of data in the right format facilitates decision-making
and planning.
The sale of value-added
Census data by third-party publishers represents a public/private partnership
that pays off for everyone involved. While taxpayers pay for the Census,
taxpayer funds do not have to be used to satisfy special needs. Publishers
can serve these special markets and make a profit.
The Role of Librarians
In order to learn
more about access and distribution of the Census, I sent surveys to two
librarians associated with the Government Documents Round Table (GODORT)
of the American Library Association. While these librarians could not speak
for GODORT, they presented some interesting perspectives.
How should U.S.
Census data be distributed?
Both librarians
answered, at least in part, "paper." Acid-free paper for preservation makes
sense; however, it is not necessary to store hundreds of volumes of numeric
data in every depository library. Selected depository libraries or trusted
third parties can fulfill the need to protect and preserve the data. Given
that more than half the households in the U.S. have access to the Internet,
the need for paper seems odd except for preservation.
The librarians
indicated that a paper compilation of data for a particular community would
be useful, because people often ask to check a quick fact. Looking up the
answer to a simple question may be faster on paper than on a computer.
Other desired forms of distribution included the Internet, DVD-ROMs, and
CD-ROMs. They suggested that depository libraries have all the raw data
on DVDs. One librarian commented that the Census Bureau's own Web site
needed sufficient bandwidth and computer power to sustain service at peak
times.
How much do
libraries rely on third-party publishers of value-added Census data?
One librarian said,
"Not much." The other librarian indicated some reliance on third-party
vendors. Both respondents are employed in academic libraries where the
faculty and researchers often prefer to do their own data manipulation.
Corporate libraries and information centers rely heavily on third-party
products because specific data are often needed quickly. These libraries
usually do not need the entire compilation of data. Public libraries also
may rely on third-party products depending on the composition of their
communities and the needs of local businesses for Census data.
How involved
are libraries with state data centers?
Again, there was
a difference. One librarian said most depositories are not involved. The
other librarian said it varied from state to state and that some university
libraries are associated with the data centers. This situation seems odd
because the state data centers often can provide useful help to librarians
and data users.
Since Census
data are numeric and many librarians are not trained to manipulate numeric
data, how much help can librarians offer Census data users?
My own experience
as a data user is mixed and mostly negative. The limitations arise because
librarians are not trained to ask the right questions related to numeric
data, especially economic and social data. Most librarians have been trained
to deal with text rather than statistics. Younger librarians may have had
more mathematics and statistics courses and be more attuned to describing
and using numeric data.
One respondent
librarian stated, "If there is adequate online documentation, and there
often is, librarians do not necessarily need to be trained to manipulate
this data, but to know where a researcher can find online assistance."
This librarian also pointed out that skill with spreadsheets could satisfy
most needs. The other librarian agreed with the idea of spreadsheet skills
and added, "The math is basic. If librarians don't know, they should be
proactive to obtain the basic skills."
How should Census
data from 1790 forward and into the future be digitized, archived, preserved,
and accessed?
Both librarians
pointed out the work of the Inter-University Consortium for Political and
Social Research [http://www.icpsr.umich.edu]
working with the University of Virginia [http://fisher.lib.virginia.edu/Census]
in bringing about access to population Census data from 1790 to 1960. One
librarian described the ideal as interactive access to enable building
of tables across censuses. They also pointed out that the Census Bureau is converting
some files to PDF format. Another ideal was for the Census Bureau to do the work;
however, our librarians recognized the difficulty of obtaining funding
for such a big job.
Access to all Census
data would allow the study of trends and changes in population, business,
manufacturing, foreign trade, transportation, and other aspects of society
and the economy documented by the data. People studying earlier censuses
now need to build their own tables and data sets for study. They often
have significant challenges in access and construction of consistent data
sets. The changes in geographic boundaries in metropolitan areas and the
inevitable errors in earlier Censuses create obstacles and the need to
temper results.
How does GODORT
work with the Bureau of the Census?
"GODORT has proven
a good forum for pooling needs and advice from librarians and communicating
these with the Census Bureau." GODORT indeed provides a useful forum, satisfying
the need and desire to improve Census products, to provide incentives for
listening to end-users, and to transmit their difficulties, experiences,
and suggestions to the Census Bureau. While the Census Bureau deals directly
with different users and user groups, librarians can contribute ideas and
suggestions from their user communities.
What are GODORT's
most important issues regarding Census data?
Our librarian respondents
spoke for themselves, not for GODORT. They indicated that the chief issues
are permanent access to electronic materials, training, preservation of
data, and migration to new formats.
What role
do you see for the depository libraries in maintaining and/or distributing
digital federal data?
The librarians
had different views. One librarian indicated that depositories should be
the permanent repositories for print, DVD-ROM, and CD-ROM formats. As space
continues to escalate in value in our cities and suburbs, it is not clear
how long depository libraries can justify storage in prime space and maintain
and preserve large print collections. As more and more public sector information
becomes available on the Internet, the need for all depositories to store
everything declines. Selected depositories and trusted third-party sites
outside urban areas may be needed in the future.
Both librarians
saw the need for profiles of their local communities in print. One librarian
suggested "cooperative projects with local governments to develop historic
data sets for their own communities." Using Census data, old and new, provides
an opportunity to produce data profiles of value and importance to local
communities.
The need to preserve
the data is clear. The question is how to preserve and how to make the
data accessible in usable form in perpetuity. There are no easy or inexpensive
answers. While the use of data on paper is limited to the quick lookup,
acid-free paper is a reasonable storage medium for the long term. The uncertainty
of the world situation clearly calls for preservation in several secure
sites and in all formats. The best storage media for the long run are acid-free
paper and silver halide microfilm.
Census Bureau Survey
In addition to
soliciting views from librarians, we sent a survey to the Census Bureau.
Several Census staffers collaborated to complete the survey.
Was the distribution
of the 2000 Census completely digital?
The data were distributed
primarily through the Internet as well as CD-ROM, DVD-ROM, and paper. Census
2000 maps can be accessed online in PDF format. Maps for the 1990 Census
had to be purchased on paper. More information on Census 2000 products
is available at [http://www.census.gov/population/www/censusdata/c2kproducts.html].
The Bureau distributed about 50,000 printed pages, down
from 450,000 in 1990.
What are the
main ways Census data are accessed by users?
"The Census Web
site [http://www.census.gov]
receives several million hits per day. Information is posted to the Web
site as soon as it becomes available. Users who need a few numbers for
a few geographic areas, a few data tables, or a thematic map can go to
the American FactFinder, a data retrieval tool on the (Census) Web site."
One librarian indicated that American FactFinder was a "viable product"
for retrieving data. Other means of access include FTP, DVD-ROM, and CD-ROM.
How does the
Bureau interact with value-added distributors and publishers of Census
data?
The Bureau realizes
that there are "customers who may need the information in different formats
or with additional functionality. Therefore we encourage our dissemination
partners and others to tailor information to local needs, combine it with
data from other sources, analyze it, or otherwise add value to it."
What role does
the Bureau see for GPO and Federal Depository Libraries?
The Bureau works
closely with GPO and the depository libraries. GPO can obtain copies of
Census information for distribution to the depository libraries.
What role does
the Bureau see for librarians in aiding Census users?
"Many librarians
are knowledgeable about the census data, access tools, maps, and census
terminology and can guide users to the information they need." The Bureau
also noted that depository libraries provide data from past censuses through
their
collections.
When will the
complete file of each Census from 1790 forward be made available online?
"Once a census
is taken, the Census Bureau provides a record of the responses to the National
Archives, where they are kept confidential for a period of 72 years. Genealogists
have been anxiously awaiting the release of the 1930 Census records, which
were recently made available by the Archives. The Archives has not digitized
files from previous censuses for online access, although several private
organizations are doing so."
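The 72-year confidentiality period quoted above makes it easy to estimate when the records of any census can be opened. The Python sketch below is a simplification that works in whole years only; the function name `release_year` is a hypothetical helper, not part of any Census Bureau or National Archives tool.

```python
CONFIDENTIALITY_YEARS = 72  # period cited in the Bureau's response

def release_year(census_year):
    """Earliest year the individual records of a census can be opened."""
    return census_year + CONFIDENTIALITY_YEARS

for year in (1930, 1940, 2000):
    print(year, "->", release_year(year))
```

For the 1930 Census this gives 2002, which matches the recent release mentioned in the Bureau's answer.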
What enhancements
are being planned for users?
Over the next several
months, enhancements to American FactFinder are planned.
These include a new main page to help guide users; addition of FIPS codes
with the Geographic Comparison Table for U.S. by state, by county and county
by county subdivision by place; "zoom by latitude and longitude to the
Thematic Maps, Reference Maps and the geographic selection by map." In
addition, the Bureau has been testing "an Advance Query function that will
allow users to develop custom tabulations from the basic records with confidentiality
restrictions and safeguards."
What is the
Bureau's commitment to maintaining the Census data and making them available
in perpetuity in usable form?
"Even though the
Bureau does not anticipate removing any of the 1990 Census or 2000 Census
data from its site, we are working with the Government Printing Office
and the Federal Depository Library Program to provide additional long-term
access through a depository library."
Use of Information Technology
From 1790 to 1880,
"Census data were tabulated by clerks who made tally marks or added columns
of figures with a pen or pencil."3
As the nation and its population grew, new methods were needed to tabulate
and analyze the vast amounts of data collected by Census takers. In 1880,
the Census Bureau first used "a tabulating machine, a wooden box in which
a roll of paper was threaded past an opening where a clerk marked the tallies
in various columns and then added up the marks when the roll was full."
This operation made tabulating the data twice as fast4.
In 1890, Herman
Hollerith assisted the Bureau with punch cards. Data were recorded by punching
holes in the cards for the data elements. The cards were run through equipment
that counted the holes. The Hollerith cards were developed from cards used
by Joseph Marie Jacquard to control pattern weaving on looms.
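The counting step can be illustrated with a short Python sketch. It is a simplified model, not a description of the actual 1890 card layout: each card is represented as a set of punched positions, and the position labels are invented for illustration.

```python
from collections import Counter

def tabulate(cards):
    """Count how many cards have a hole punched at each position."""
    counts = Counter()
    for card in cards:
        counts.update(card)  # one tally per punched position on this card
    return counts

# Hypothetical cards; the labels are illustrative, not the historical layout.
cards = [
    {"male", "age_20_29", "farmer"},
    {"female", "age_30_39"},
    {"male", "age_30_39", "farmer"},
]

totals = tabulate(cards)
for position in sorted(totals):
    print(position, totals[position])
```

Each pass over the stack of cards yields a total for every data element at once, which is the advantage the punched holes offered over hand tallies.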
The 1950 Census
of Population used a Univac computer for tabulation of data. The Univac
tabulated 4,000 items per minute. Punch cards were no longer suitable for
recording data. For the 1960 Census, the Census Bureau and the National
Bureau of Standards developed FOSDIC (film optical sensing device for input
to computers). FOSDIC was used until the 2000 Census. Respondents completed
the survey by filling in dots opposite the appropriate answers. The completed
form was photographed onto microfilm. FOSDIC read the dots and transferred the data to tape for
computer input5.
The 1960 Census
was the first to use the mail for collection of data. People were asked
to complete the survey forms and hold them until the Census taker appeared
to review and retrieve the form. Now the Census of Population is completed
mostly by mailing forms to households and having them returned via the
mail.
The Census Bureau
began making data available to the public early in the 20th century. From
the 1920s to the 1950s, data were distributed on punch cards. In the 1960s,
the Bureau began distribution on tape. By the 1980s, it became possible
to distribute data on diskettes. Later the Bureau switched to CD-ROMs and
the Internet.
The Census Bureau
has led in the development and use of technology for the collection and
distribution of large data sets. The Bureau also has been a pioneer in
the distribution of data on maps to illustrate demographic data for particular
geographic areas or specific data items.
Future
Work on the Census
of 2010 is underway. The 2010 Census forms will be mailed to households
with addresses that receive mail. The forms will be returned by mail, scanned,
and recorded. Households that do not return the forms or that do not receive
mail will be visited by a Census taker who will record data about the residents
using a hand-held device. This part of the operation will be paperless,
more efficient, and perhaps more accurate6.
The Bureau has
not indicated when households will complete their forms via the Internet.
Since people can now file their income tax returns electronically, it seems
reasonable to assume that in the near future people will be able to complete
their Census forms via the Internet. Security may be a primary obstacle.
Secure systems are essential to preserve the privacy and integrity of the
Census process.
Preservation and
access in perpetuity are difficult problems. The Census Bureau and librarians
recognize the challenges and are committed to finding solutions. The 1990
and 2000 Censuses are in digital form and can be preserved and archived.
The cost of converting past censuses to digital formats may be too high
for the Bureau, the GPO, or any single agency. Digitization may have to
come from the private sector or some cooperative arrangement of government
and nongovernment organizations. Preservation of the statistical documentation
of our history is essential. The loss of records and archives on September
11th alone illustrated the need for preservation and archiving.
Footnotes
1. Census History
and 20th Century Firsts, http://infoplease.com/spot/Census2.html.
2. http://fisher.lib.virginia.edu/Census/background.
3. U.S. Census
Bureau, Factfinder for the Nation, Washington, DC, May 2000, p.
10.
4. Ibid.
5. Ibid.,
p. 11
6. Bob Brewin,
"U.S. Census Bureau Plans for the First Paperless Tally in 2010," Computerworld,
March 18, 2002, p. 5.