Abstract

Computer Graphics 2015: Facial Animation - Phonemes Be Gone

Any serious animator worth their weight in frames has seen Preston Blair's mouth expressions or heard of using phonemes for animating a character's lip sync. In its day this was quite an effective way for animators to break dialogue down into something manageable. The trouble is that hand-drawn animation has never needed to be concerned with recreating perfectly believable lip sync; after all, the starting point of traditional hand-drawn animation is already several steps removed from realism. This thinking, however, has carried over into CG animation in a few ways. Often character rigs will have predefined shapes for a character which, rightfully so, can be art directed; that is usually a desired trait, especially if there is a large animation team or a specific thing a character is known for. However, these confine you to those shapes and make more work for riggers and modelers. Animators also lose a bit of control by the nature of this technique. The technique is also used often in games to automate facial animation, since games often have far more dialogue to deal with than most feature films. However, it produces overly chattery results, hurting the visuals and even kicking the player out of their suspension of disbelief.

I am proposing a different method, now that CG offers us the ability, for better or worse, to infinitely tweak our animation to achieve the most subtle of motion. This is a method I have developed over my 16+ years animating characters and creatures who needed to speak dialogue, and it involves a deeper understanding of how humans speak, what our mouths are capable of doing muscularly, and the way we perceive what someone is saying when we watch them speak. It also takes some burden off the modelers and riggers, and simplifies controls for animators while increasing the control it affords them. I didn't invent this, nature did; I've just refined how I think about it and distilled it down into an explanation I have never heard presented this way. My students are very receptive to this approach and often find it takes the mystery out of effective lip sync, making it easier and faster to produce than they thought. Performance and lip sync are my favorite things to work on as an animator.

Anyone who has ever been in a professional production situation realizes that real-world coding these days requires a broad area of expertise. When this expertise is lacking, developers need to be humble enough to look things up and turn to the people around them who are experienced in that particular area. As I continue to explore areas of graphics technology, I have attempted to document the research and resources I have used in creating projects for my company. My research demands change from month to month depending on what is needed at the time. This month, I have the need to develop some facial animation techniques, particularly lip synchronization. This means I need to shelve my physics research for a bit and get some other work done. I hope to get back to moments of inertia, and such, real soon.

Facial animation in games has built upon this tradition. Chiefly, this has been achieved through cut-scene movies animated using many of the same methods. Games like Full Throttle and The Curse of Monkey Island used facial animation for their 2D cartoon characters in the same way that the Disney animators would have. More recently, games have begun to incorporate some facial animation in real-time 3D projects.
Tomb Raider has had scenes in which the 3D characters pantomime the dialog, but the face isn't actually animated. Grim Fandango uses texture animation and mesh animation for a basic level of facial animation. Even console titles like Banjo-Kazooie are experimenting with real-time "lip-flap" without even having a dialog track. How do I leverage this tradition into my own project?

This paper presents a novel data-driven animation system for expressive facial animation synthesis and editing. Given novel phoneme-aligned speech input and its emotion modifiers (specifications), the system automatically generates expressive facial animation by concatenating captured motion data while animators establish constraints and goals. A constrained dynamic programming algorithm is used to search for best-matched captured motion nodes by minimizing a cost function. Users optionally specify "hard constraints" (motion-node constraints for expressing phoneme utterances) and "soft constraints" (emotion modifiers) to guide the search process. Users can also edit the processed facial motion-node database by inserting and deleting motion nodes via a novel phoneme-Isomap interface. Novel facial animation synthesis experiments and objective trajectory comparisons between synthesized facial motion and captured motion demonstrate that this system is effective for producing realistic, expressive facial animations. In particular, state-of-the-art data-driven speech animation approaches have achieved significant success in terms of animation quality. One of the main reasons is that these techniques heavily exploit the prior knowledge encoded in precollected training facial motion data sets, by concatenating presegmented motion samples (e.g., triphone- or syllable-based motion subsequences) or learning facial motion statistical models.
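To make the search step concrete, here is a minimal Viterbi-style sketch in Python of how a constrained dynamic-programming pass over a motion-node database might look. Everything in it (the MotionNode fields, match_cost, transition_cost, and the weights) is an illustrative assumption rather than the published system's actual cost function: hard constraints are modeled by pinning a time slot to a specific database index, and soft constraints by penalties in the per-node cost.

```python
# Minimal sketch (assumed names, not the paper's implementation) of a
# constrained dynamic-programming search over captured motion nodes.
from dataclasses import dataclass
import numpy as np

@dataclass
class MotionNode:
    phoneme: str        # phoneme label this captured subsequence expresses
    emotion: str        # emotion label recorded with the clip
    frames: np.ndarray  # (num_frames, num_markers * 3) facial marker motion

def match_cost(node, phoneme, emotion, w_emotion=1.0):
    """Soft-constraint cost: penalize phoneme/emotion mismatch."""
    cost = 0.0 if node.phoneme == phoneme else 10.0
    cost += 0.0 if node.emotion == emotion else w_emotion
    return cost

def transition_cost(prev, nxt):
    """Smoothness cost: distance between the end of one node and the start of the next."""
    return float(np.linalg.norm(prev.frames[-1] - nxt.frames[0]))

def search_motion_nodes(database, phonemes, emotions, hard_constraints=None):
    """Pick one motion node per phoneme slot minimizing match + transition cost.
    `hard_constraints` maps a slot index to a required database index."""
    hard_constraints = hard_constraints or {}
    n_slots, n_nodes = len(phonemes), len(database)
    INF = float("inf")
    cost = np.full((n_slots, n_nodes), INF)
    back = np.zeros((n_slots, n_nodes), dtype=int)

    def allowed(t):  # candidate node indices for slot t
        return [hard_constraints[t]] if t in hard_constraints else range(n_nodes)

    for j in allowed(0):
        cost[0, j] = match_cost(database[j], phonemes[0], emotions[0])

    for t in range(1, n_slots):
        for j in allowed(t):
            local = match_cost(database[j], phonemes[t], emotions[t])
            for i in allowed(t - 1):
                if cost[t - 1, i] == INF:
                    continue
                c = cost[t - 1, i] + transition_cost(database[i], database[j]) + local
                if c < cost[t, j]:
                    cost[t, j], back[t, j] = c, i

    # Backtrack the cheapest path of motion-node indices.
    j = int(np.argmin(cost[-1]))
    path = [j]
    for t in range(n_slots - 1, 0, -1):
        j = back[t, j]
        path.append(j)
    return list(reversed(path))
```

In a real system the per-node cost would also score viseme similarity and weighted emotion modifiers, and the selected nodes would be blended at the joins; the sketch only shows the constrained search structure that the abstract describes.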
Author(s): David M Breaux
