SMITHSONIAN AMERICAN WOMEN'S HISTORY MUSEUM
Discoverability Lab Offers New Look at Historical Data and Machine Learning
Explore early experiments to increase the discoverability of women’s history at the Smithsonian.
Over the last century, and especially since the 1960s, historians have increased the visibility of women’s lives and work in American history so that we can see the past more clearly. They’ve accomplished this by experimenting with new research and interpretive methods that expand the way we interact with historical records and museum objects and reveal new insights into women’s history.
For example, beginning in the 1960s, Margaret Rossiter mined archival records, such as government reports and institutional histories, to prove that women had contributed to scientific fields throughout American history. Later, Laurel Thatcher Ulrich opened new interpretive possibilities for historians when she used a long-overlooked midwife’s diary from the 1700s to unlock new understandings of community life in a Maine seaport. More recently, Tiya Miles inspired the field by using a cotton sack, a family keepsake created by an enslaved mother for her daughter, to tell a sweeping story about American slavery and a Black family.
These rich interpretive methods, alongside museums’ and archives’ efforts in the 2000s to digitize, transcribe, and share historical collections, have created new opportunities for collaboration among historical researchers and data scientists. Historians have long used quantitative methods in their work; more recently, they have been experimenting with computational methods to further expand what we can learn from historical records. Text mining, social network analysis, and data visualization are just a few of the applications that have emerged.
These tools and technologies allow historians, archivists, and museum professionals to build on earlier advances in historical research. The Smithsonian American Women’s History Museum’s Discoverability Lab is a launchpad for accelerating these new approaches and sharing exciting discoveries about women’s history in America with the public.
Using technology to increase and diffuse knowledge is not new to the Smithsonian. The Discoverability Lab builds on previous efforts. Below are a few early experiments with historical data and machine learning that have inspired the Discoverability Lab’s work.
The Funk List: Increasing Access to Histories of Women in Science with a Structured Dataset
In 2018, Smithsonian digital strategist Effie Kapsalis worked with a team at the Smithsonian Libraries and Archives and the Smithsonian National Museum of Natural History to create a structured dataset (a body of data in a standardized, machine-readable format) documenting the history of women in science at the Smithsonian. First, historians mined archival records, such as phone books, annual reports, and curatorial records, to identify women in science in Smithsonian history and to document how their careers changed over time. Then, the team entered the information collected in that research into the structured dataset, which contains fields such as full names, professional titles held over time, educational background, and publications. Analysis of these data resulted in new insights about women’s multifaceted roles in Smithsonian science, and, even more importantly, this work set the stage for experiments with historical research and data science. Eventually named “The Funk List” in honor of Dr. Vicki Funk, a botanist and founding member of the team, the work grew to include experiments with metadata, open access collections, and open knowledge work.
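To make the idea of a structured dataset concrete, here is a minimal sketch in Python of what one machine-readable record with the fields named above might look like. It is an illustration only, not the Funk List's actual schema: the class names, field names, and every value in the sample record are hypothetical placeholders.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class TitleHeld:
    """One professional title and the (assumed) years it was held."""
    title: str
    start_year: Optional[int] = None
    end_year: Optional[int] = None  # None means the end year is unknown or open


@dataclass
class ScientistRecord:
    """Hypothetical record layout; the real dataset's fields may differ."""
    full_name: str
    titles_over_time: List[TitleHeld] = field(default_factory=list)
    education: List[str] = field(default_factory=list)
    publications: List[str] = field(default_factory=list)


# Placeholder values only; this is not a real person or a real entry.
record = ScientistRecord(
    full_name="Example Person",
    titles_over_time=[
        TitleHeld("Aide", 1900, 1905),
        TitleHeld("Curator", 1905, None),
    ],
    education=["Example University"],
    publications=["Example publication title"],
)

# Serializing to JSON yields a standardized format that software can read.
record_json = json.dumps(asdict(record), indent=2)
print(record_json)
```

Storing each title with its own date range, rather than a single "job title" field, is one way such a record could capture how a career changed over time.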
Surfacing Women in Smithsonian History: Experimenting with Open Access Collections Metadata
In 2020, Smithsonian Open Access released 2.8 million 2D and 3D images and 173 years of staff-created data into the public domain as Creative Commons Zero, meaning the images are free for everyone to download, share, and reuse. In celebration, researchers at the Smithsonian partnered with Google Arts & Culture to experiment with using machine learning to surface stories about women’s history in the Institution’s collections metadata, the data that depicts, describes, and structures information about its collections. The team created a network view of data about women in science in Smithsonian history, experimented with extracting women’s names from the data, and used a clustering algorithm on images in Smithsonian collections data to reveal how collections might be visually grouped by topic. All these experiments provided new insights about how the Smithsonian can make women’s history more discoverable at scale.
What’s in a Name: Improving the Accuracy of Historical Data through Interdisciplinary Collaboration

In 2019, historians and data scientists worked together at the Smithsonian to experiment with new approaches to uncovering information about women’s history within its specimen collections, objects, and institutional records. In the process, intern Tiana Curry surfaced a particular challenge to uncovering and sharing women’s stories: the historical practice of recording women, especially married women, by honorific title, combined with the digitization and transcription process, could erase their contributions. Specifically, the use of “Ms./Miss/Mrs. Last Name” led to some scientific contributions being misattributed to a woman’s husband, brother, or father rather than to the woman herself. The project, “What’s in a name?”, highlights the story of artist and botanist Mary Vaux Walcott as an example of this particular challenge.
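As a rough illustration of how name records of this kind might be flagged for a historian's review, the Python sketch below checks whether a transcribed name begins with an honorific. It is not the project's actual method: the pattern, the function name, and the sample names are all assumptions made for the example, and a match only signals that a record deserves a closer look, not whose work it documents.

```python
import re
from typing import Dict, Optional

# Assumed pattern: an honorific (Mrs, Miss, or Ms, with or without a period)
# followed by whatever name text was transcribed after it.
HONORIFIC = re.compile(r"^(Mrs|Miss|Ms)\.?\s+(.+)$", re.IGNORECASE)


def flag_honorific_name(raw_name: str) -> Optional[Dict[str, str]]:
    """Return the honorific and remaining name text, or None if no match."""
    match = HONORIFIC.match(raw_name.strip())
    if match is None:
        return None
    return {"honorific": match.group(1).title(), "rest": match.group(2)}


# Hypothetical transcribed names, not taken from any real record.
for name in ["Mrs. J. Smith", "Miss Smith", "Jane Smith"]:
    print(name, "->", flag_honorific_name(name))
```

A flagged entry such as the hypothetical “Mrs. J. Smith” would still need archival research to establish the woman’s own name, which is why this kind of work pairs data scientists with historians.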
All of these experiments demonstrate how researchers can harness over 100 years of research innovations in women’s history to ensure that women are visible in American history today. The Discoverability Lab is a place where these research innovations can be shared and explored. At the Smithsonian, and across the world’s cultural heritage institutions, there are important stories of how women have shaped the world waiting to be unlocked.
Further Reading
- “Because of Her Story: The Funk List,” 2019, Smithsonian American Women’s History Museum.
- Surfacing Women in Smithsonian History, Google Arts & Culture Experiments, 2022.
- “The Funk List,” Sidedoor Season 9, episode 7, 2023.
- “Using Data Science to Uncover the Work of Women in Science,” 2022, Smithsonian American Women’s History Museum.
- “What’s in a name? Uncovering stories of women in science,” by Rebecca Dikow and Megan Glenn, 2020, Smithsonian Data Science Lab.
- “Women Scientists Were Written Out of History. It’s Margaret Rossiter’s Lifelong Mission to Fix That,” by Susan Dominus, 2019, Smithsonian Magazine.
- “New Partnership Illuminates Hidden Record of NASA’s Human Computers,” Smithsonian American Women’s History Museum.
- “The Challenge of Metadata in Uncovering Women’s History,” Smithsonian American Women’s History Museum.