Statistics was on the menu at this month’s Davis Science Café, a modern-day version of the intellectual salons of the past where science and the community meet face-to-face for a conversation.
Fushing Hsieh, a professor in the Department of Statistics, discussed the wide-ranging nature of his research in a conversation paradoxically titled “Statistical Analysis is Unscientific.”
While somewhat tongue-in-cheek, the title of Hsieh’s talk was meant to emphasize the potential overreliance of humans on artificial intelligence systems like ChatGPT to untangle and find meaning in large swaths of data.
“I want to emphasize that data analysis is a creative task, not just fitting models one after the other,” Hsieh said following the presentation.
In a culture increasingly dominated by and reliant on artificial intelligence systems, Hsieh urged the audience to remember that the human mind is still needed to design statistical analysis models and to critically evaluate findings to ensure they’re authentic.
“Behind the model, there are many assumptions,” Hsieh said to the audience gathered at G Street Wunderbar in Davis. “Sometimes you have to think a little bit about whether this piece of information is trustworthy or not.”
The assumptions of the model can influence a ChatGPT calculation, he said. “It might contaminate your data analysis, so that’s why I use this sort of title to somehow make people a little more aware of the issue.”
Hsieh’s discussion with the audience recounted his trajectory in academia and highlighted a handful of his research projects, from analyzing the colors of centuries-old artworks to studying social hierarchy in communities of primates.
Revealing colors lost to time
In a letter to his brother, artist Vincent van Gogh once wrote that “paintings fade like flowers.” At the time of the letter’s writing, van Gogh was concerned about how his paintings aged within his lifetime. What he didn’t know was that over 130 years later, statisticians would still be pondering this problem.
Van Gogh tried to use premium, brilliant colors to combat the effects of aging, Hsieh said. What he didn’t realize is that sometimes those brilliant colors are the ones impacted the most by aging.
In a paper published in Heritage Science, Hsieh and UC Davis colleagues analyzed van Gogh’s Sunflower series in an attempt to illuminate the effects of aging on them.
To recover the original colors of the paintings, the team analyzed the red, green and blue components of digital images of them. Digital images build each displayed color from a combination of these three channels.
Specifically, the team collected data points representing this information from key regions of interest in the paintings, including the flowers, the flowers’ stems and the background. They then compared those areas in the digital images of the paintings with photographs of real sunflowers.
Of the red, green and blue channels, the team found the most pronounced shifts occurred in the blue channel. They used that information to design an algorithm capable of reversing the effects of aging and recreating how the Sunflower series appeared during van Gogh’s time.
The result: paintings with more yellow-infused, vibrant backgrounds than the originals displayed in the National Gallery in London, the Sompo Japan Museum of Art in Tokyo and the Van Gogh Museum in Amsterdam.
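The channel comparison described above can be illustrated with a minimal Python sketch. It is not the team's published algorithm: the function names are hypothetical, the pixel values are made up for illustration, and the approach (comparing the average red, green and blue values of a painting region against a reference photograph) is an assumed simplification.

```python
from statistics import mean

CHANNELS = ("red", "green", "blue")

def channel_means(pixels):
    """Return the mean (R, G, B) values of a list of (r, g, b) tuples."""
    reds, greens, blues = zip(*pixels)
    return mean(reds), mean(greens), mean(blues)

def channel_shifts(painting_pixels, reference_pixels):
    """Per-channel difference: reference mean minus painting mean."""
    painting = channel_means(painting_pixels)
    reference = channel_means(reference_pixels)
    return {name: ref - pt for name, pt, ref in zip(CHANNELS, painting, reference)}

def largest_shift(shifts):
    """Name of the channel whose mean moved the most, in either direction."""
    return max(shifts, key=lambda name: abs(shifts[name]))

# Made-up sample values, NOT measurements from the study.
painting_region = [(200, 170, 90), (210, 180, 100), (190, 160, 80)]
reference_region = [(205, 175, 40), (215, 185, 50), (195, 165, 30)]

shifts = channel_shifts(painting_region, reference_region)
print(shifts)
print(largest_shift(shifts))
```

In this toy data the blue channel differs far more than red or green, so a correction would concentrate on blue. A real de-aging method would need calibrated images and a far more careful model of how pigments change.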
With art being such an evocative experience, Hsieh said that such de-aging work could bring the artist’s original intent to viewers despite the distance of time between the two.
Using statistics to study social hierarchy
Standing before a stage in the main room of G Street Wunderbar, Hsieh shifted the conversation to his work studying social hierarchy in captive rhesus macaques.
In a study appearing in Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, Hsieh and colleagues designed a computer algorithm to create a ranking list corresponding to social hierarchy for a population of roughly 100 monkeys.
The data consisted of aggressive interactions between monkeys that resulted in a decisive outcome. While the research produced high-confidence rankings for the top two monkeys in the population, the rest of the social hierarchy was murkier.
According to Hsieh, the findings emphasized the shortcomings of linear rankings.
“Linear ranking is very convenient, but it’s not that scientific,” he said, noting how the same two monkeys can have two aggressive interactions with different end results. Additionally, certain monkeys may not interact with each other at all, leading to a lack of data to inform an algorithm.
To navigate around this problem, Hsieh and colleagues introduced a property called “transitivity” to their ranking algorithm. In essence, the property allowed them to infer the results of aggressive interactions between two monkeys, even if those monkeys had never interacted before.
For example, let’s say “monkey A” beats “monkey B” in an aggressive conflict, and then “monkey B” beats “monkey C” in a separate aggressive conflict. Transitivity allows researchers to infer in a statistically sound way that “monkey A” will beat “monkey C” in a hypothetical conflict.
The workaround enabled the researchers to better rank the social hierarchy in their studied monkey population.
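The monkey A, B and C example can be sketched in Python. This is a simplified illustration rather than the researchers' algorithm: it applies transitivity as a strict rule, whereas the study treats such inferences statistically, and it does not handle pairs with conflicting outcomes. The function name `infer_dominance` is hypothetical.

```python
def infer_dominance(observed):
    """Return the observed wins plus every win implied by transitivity.

    `observed` is an iterable of (winner, loser) pairs.
    """
    wins = set(observed)
    changed = True
    while changed:
        changed = False
        for a, b in list(wins):
            for c, d in list(wins):
                # If a beat b and b beat d, infer that a would beat d.
                if b == c and a != d and (a, d) not in wins:
                    wins.add((a, d))
                    changed = True
    return wins

# The example from the text: A beats B, and B beats C.
observed = [("A", "B"), ("B", "C")]
inferred = infer_dominance(observed) - set(observed)
print(inferred)
```

Here the only new pair is A over C, even though those two monkeys never met. With real data, where the same pair can produce opposite outcomes, each inferred win would need a measure of confidence rather than a yes-or-no answer.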
Authentic information in an AI world
Hsieh advocates for scientific data analysis (established statistical models) to be integrated into machine learning and AI systems, which may otherwise be black boxes: fed inputs and producing outputs without any transparency about the model or method used to connect the two.
Such a system would be an AI, but one that Hsieh said stands for an “authentic information” robot. He’s currently working on a book about the concept.
The Davis Science Café is held every second Wednesday of the month.