Research seminar in Microdata Analysis

Ross May presents his work: A multi-agent reinforcement learning simulator for investigating and optimising P2P prosumer energy networks.

Date: , 10:30 - 12:00
Location: 311 and Zoom

The current power grid infrastructure was not designed with climate change in mind, and its stability, especially during peak demand periods, has therefore been compromised. Furthermore, in light of the current UN Intergovernmental Panel on Climate Change (IPCC) reports on global warming and the goal of the 2015 Paris climate agreement to reduce carbon emissions and limit the global temperature increase to within 1.5-2°C above pre-industrial levels, urgent sociotechnical measures need to be taken. Smart microgrids combined with renewable energy technology have been put forward as one of many solutions to help mitigate global warming and grid instability. Within a smart microgrid, well-managed demand-side flexibility is crucial for efficiently utilising on-site solar energy. To this end, a well-designed dynamic pricing mechanism can coordinate the actors within such a system to enable the efficient trade of on-site energy, thereby contributing to the decarbonisation and grid security goals mentioned above. However, designing such a mechanism in an economic setting this complex and dynamic often leads to computationally intractable problems, that is, ones whose solutions are very difficult or impossible to compute exactly. To overcome this, we extend Foundation, an open-source economic simulation framework built by Salesforce, to incorporate a community of prosumers with heterogeneous demand/supply profiles and battery storage. Using data based on a community in Sweden, we carried out data-driven simulations using reinforcement learning (RL) to investigate the complex dynamics of this peer-to-peer (P2P) community and to learn an optimal pricing mechanism. With minimisation of electricity cost as the primary objective of this study, our results show that RL can learn a dynamic price signal that achieves a lower total electricity cost for the P2P community than a baseline fixed-price signal.
We have also identified emergent socio-economic behaviours. For example, prosumers with more supply than demand prefer selling energy to storing it, and, conversely, those with more demand than supply prefer storing energy to selling it. Lastly, a higher community self-sufficiency score was observed under RL. Practitioners in the energy field can use our proposed approach to aid in designing P2P energy trading markets within smart microgrids.



For more information, contact
Senior Lecturer, Data and Information Management