In this blog EDRC researcher Michael Fell discusses his recent preprint paper looking at how Large Language Models can be used to simulate survey research, and potentially help make social research in energy more productive.
In 2023, Large Language Models (LLMs) like ChatGPT exploded into the public consciousness and promised to transform the way we work. Users began searching for ways to capitalise on their uncanny ability to understand and generate human language to increase productivity across a range of fields.
Nowhere is productivity needed more than in efforts to address climate change. Finding ways to move quickly using available resources offers the best chance at avoiding its severest impacts. The potential contributions of LLMs in this area are significant, from helping map environmental damage to supporting better climate communication.
I’ve been investigating ways that LLMs could help improve the productivity of social research in energy. I think of productivity here as the amount of positive impact delivered per unit of research resource committed. Specifically, I was interested in their ability to replicate existing findings from social survey studies. If LLMs could reliably give an indication of things surveys test – like which messages could have the biggest effect on decision making – this opens the door to a range of intriguing possibilities.
For example, taking advantage of the fact that LLMs are cheap and quick to use, researchers could rapidly test many different interventions before deploying only the most promising ones to more resource-intensive human or field trials. It could also be possible to do preparatory research on LLM survey “participants” from populations who may be harder to access directly (e.g. due to their time constraints or lack of digital connectivity). This way the best possible use can be made of subsequent contact with real human participants.
So how does this work?
The approach I use (I’m not the first) is to generate multiple LLM agents and prompt them to respond to survey questions. As anyone who has asked ChatGPT to write an email in the style of Shakespeare knows, it’s possible to endow an LLM with certain characteristics and have it respond accordingly. An agent can be given an age, gender, environmental attitudes, different personality characteristics, social relations, emotions, experiences – anything a researcher thinks might be important to the way someone could respond. A thousand agents can easily be created (with characteristics representative of a general population where data are available) and prompted in turn, delivering a dataset of a thousand survey responses which can be analysed in the usual way.
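To make this concrete, here is a minimal sketch of the agent-generation step. All the characteristic categories, weights, and the example survey question are hypothetical placeholders, not taken from the study; in practice the weights would come from population statistics, and each prompt would be sent to an LLM API to collect a response.

```python
import random

# Hypothetical characteristics and population weights (illustrative only;
# real work would draw these from census or survey data where available).
AGE_BANDS = ["18-34", "35-54", "55+"]
AGE_WEIGHTS = [0.30, 0.35, 0.35]
GENDERS = ["female", "male", "non-binary"]
GENDER_WEIGHTS = [0.50, 0.49, 0.01]
ATTITUDES = [
    "very concerned about climate change",
    "somewhat concerned about climate change",
    "not concerned about climate change",
]

def make_agent(rng):
    """Sample one synthetic survey participant's characteristics."""
    return {
        "age": rng.choices(AGE_BANDS, weights=AGE_WEIGHTS)[0],
        "gender": rng.choices(GENDERS, weights=GENDER_WEIGHTS)[0],
        "attitude": rng.choice(ATTITUDES),
    }

def persona_prompt(agent, question):
    """Turn an agent's characteristics into a persona-setting prompt."""
    return (
        f"You are a survey respondent aged {agent['age']}, {agent['gender']}, "
        f"who is {agent['attitude']}. Answer the question below in character, "
        f"giving only a number from 1 (strongly disagree) to 5 (strongly agree).\n\n"
        f"Question: {question}"
    )

rng = random.Random(42)  # fixed seed so the synthetic sample is reproducible
agents = [make_agent(rng) for _ in range(1000)]
question = "I would consider installing a heat pump in my home."  # made-up example item
prompts = [persona_prompt(agent, question) for agent in agents]

# Each prompt would then be sent to an LLM (one call per agent); the numeric
# replies form a synthetic survey dataset that can be analysed in the usual way.
print(f"Built {len(prompts)} persona prompts")
```

The LLM call itself is deliberately left out, since any provider's chat API would do; the point is that the persona prompt is the only thing distinguishing one "participant" from another.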
How successful is it?
So far I’ve used the approach to try to replicate findings from parts of three existing studies. In one case there was moderate resemblance to the results of the original study. In the other two, the findings were so similar that the same conclusions would probably be drawn from the LLM survey data as from the real human data.