dc.contributor.author | Lemonari, Marilena | en_US |
dc.contributor.author | Andreou, Nefeli | en_US |
dc.contributor.author | Pelechano, Nuria | en_US |
dc.contributor.author | Charalambous, Panayiotis | en_US |
dc.contributor.author | Chrysanthou, Yiorgos | en_US |
dc.contributor.editor | Pelechano, Nuria | en_US |
dc.contributor.editor | Pettré, Julien | en_US |
dc.date.accessioned | 2024-04-16T14:59:34Z | |
dc.date.available | 2024-04-16T14:59:34Z | |
dc.date.issued | 2024 | |
dc.identifier.isbn | 978-3-03868-241-7 | |
dc.identifier.uri | https://doi.org/10.2312/cl.20241049 | |
dc.identifier.uri | https://diglib.eg.org:443/handle/10.2312/cl20241049 | |
dc.description.abstract | Creating believable virtual crowds, controllable by high-level prompts, is essential for creators who must trade off authoring freedom against simulation quality. The flexibility and familiarity of natural language, in particular, motivate the use of text to guide the generation process. Capturing the essence of textually described crowd movements in the form of meaningful and usable parameters is challenging due to the lack of paired ground-truth data and the inherent ambiguity between the two modalities. In this work, we leverage a pre-trained Large Language Model (LLM) to create pseudo-pairs of text and behaviour labels. We train a variational auto-encoder (VAE) on the synthetic dataset, constraining the latent space to interpretable behaviour parameters by incorporating a latent label loss. To showcase our model's capabilities, we deploy a survey where humans provide textual descriptions of real crowd datasets. We demonstrate that our model is able to parameterise unseen sentences and produce novel behaviours, capturing the essence of the given sentence; our behaviour space is compatible with simulator parameters, enabling the generation of plausible crowds (text-to-crowds). We also conduct feasibility experiments exhibiting the potential of the output text embeddings for full sentence generation from a behaviour profile. | en_US |
dc.publisher | The Eurographics Association | en_US |
dc.rights | Attribution 4.0 International License | |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | |
dc.subject | CCS Concepts: Computing methodologies → Neural networks; Natural language processing; Computer graphics | |
dc.subject | Computing methodologies → Neural networks | |
dc.subject | Natural language processing | |
dc.subject | Computer graphics | |
dc.title | LexiCrowd: A Learning Paradigm towards Text to Behaviour Parameters for Crowds | en_US |
dc.description.seriesinformation | CLIPE 2024 - Creating Lively Interactive Populated Environments | |
dc.description.sectionheaders | Mocap and Authoring Virtual Humans | |
dc.identifier.doi | 10.2312/cl.20241049 | |
dc.identifier.pages | 9 pages | |