dc.contributor.author: Lemonari, Marilena [en_US]
dc.contributor.author: Andreou, Nefeli [en_US]
dc.contributor.author: Pelechano, Nuria [en_US]
dc.contributor.author: Charalambous, Panayiotis [en_US]
dc.contributor.author: Chrysanthou, Yiorgos [en_US]
dc.contributor.editor: Pelechano, Nuria [en_US]
dc.contributor.editor: Pettré, Julien [en_US]
dc.date.accessioned: 2024-04-16T14:59:34Z
dc.date.available: 2024-04-16T14:59:34Z
dc.date.issued: 2024
dc.identifier.isbn: 978-3-03868-241-7
dc.identifier.uri: https://doi.org/10.2312/cl.20241049
dc.identifier.uri: https://diglib.eg.org:443/handle/10.2312/cl20241049
dc.description.abstract: Creating believable virtual crowds, controllable by high-level prompts, is essential to creators for trading off authoring freedom and simulation quality. The flexibility and familiarity of natural language, in particular, motivate the use of text to guide the generation process. Capturing the essence of textually described crowd movements in the form of meaningful and usable parameters is challenging due to the lack of paired ground-truth data and the inherent ambiguity between the two modalities. In this work, we leverage a pre-trained Large Language Model (LLM) to create pseudo-pairs of text and behaviour labels. We train a variational auto-encoder (VAE) on the synthetic dataset, constraining the latent space into interpretable behaviour parameters by incorporating a latent label loss. To showcase our model's capabilities, we deploy a survey where humans provide textual descriptions of real crowd datasets. We demonstrate that our model is able to parameterise unseen sentences and produce novel behaviours, capturing the essence of the given sentence; our behaviour space is compatible with simulator parameters, enabling the generation of plausible crowds (text-to-crowds). We also conduct feasibility experiments exhibiting the potential of the output text embeddings for full sentence generation from a behaviour profile. [en_US]
dc.publisher: The Eurographics Association [en_US]
dc.rights: Attribution 4.0 International License
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: CCS Concepts: Computing methodologies → Neural networks; Natural language processing; Computer graphics
dc.subject: Computing methodologies → Neural networks
dc.subject: Natural language processing
dc.subject: Computer graphics
dc.title: LexiCrowd: A Learning Paradigm towards Text to Behaviour Parameters for Crowds [en_US]
dc.description.seriesinformation: CLIPE 2024 - Creating Lively Interactive Populated Environments
dc.description.sectionheaders: Mocap and Authoring Virtual Humans
dc.identifier.doi: 10.2312/cl.20241049
dc.identifier.pages: 9 pages


This item appears in the following Collection(s)

  • CLIPE 2024
    ISBN 978-3-03868-241-7 | co-located with EG2024

Except where otherwise noted, this item's license is described as Attribution 4.0 International License