WebXR Expression Tracking - Level 1

Unofficial Proposal Draft,

This version:
https://cabanier.github.io/webxr-expression-tracking-1/
Issue Tracking:
GitHub
Editor:
(Meta)

Abstract

The WebXR Expression Tracking module extends the WebXR Device API with the ability to track the user's facial expressions.

Status of this document

This module is designed to be implemented in addition to the WebXR Device API. It was originally included in the WebXR Device API, which was later divided into a core specification and modules.

1. Introduction

This API exposes the expressions of the user’s face and the position of their eyes. This can be used to render a more immersive avatar.

2. Initialization

If an application wants to get expressions during an immersive session, the session MUST be requested with an appropriate feature descriptor. The string "expression-tracking" is introduced by this module as a new valid feature descriptor for face expressions.

The "expression-tracking" feature descriptor should only be granted for an XRSession when its XR device has sensor data to support reporting of expressions.
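As a rough illustration of the initialization described above, the following TypeScript sketch requests an immersive session with the "expression-tracking" feature descriptor. The `XRSystemLike` and `XRSessionInitLike` types and the `requestExpressionSession` helper are hypothetical names introduced here so that the sketch is self-contained; in a browser, `navigator.xr` would be passed in. Listing the descriptor under `optionalFeatures` is a choice made for this example, so that the session can still start on devices without sensor data for expressions.

```typescript
// Minimal structural types so the sketch compiles without WebXR typings.
// These are assumptions for illustration, not part of this module.
interface XRSessionInitLike {
  requiredFeatures?: string[];
  optionalFeatures?: string[];
}

interface XRSystemLike<S> {
  requestSession(mode: string, init?: XRSessionInitLike): Promise<S>;
}

// Feature descriptor introduced by this module.
const EXPRESSION_TRACKING = "expression-tracking";

// Hypothetical helper: requests an immersive session that may report
// expressions. The feature is optional, so the request can still succeed
// on devices that have no sensor data for expressions.
async function requestExpressionSession<S>(
  xr: XRSystemLike<S>,
  mode: "immersive-vr" | "immersive-ar" = "immersive-vr",
): Promise<S> {
  return xr.requestSession(mode, { optionalFeatures: [EXPRESSION_TRACKING] });
}
```

An application that cannot function without expressions could instead list the descriptor under `requiredFeatures`, in which case the session request is expected to fail when the feature cannot be granted.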

3. Access to expressions

3.1. Expressions

An expression MUST be one of the XRExpression types. A user agent MAY support a subset of XRExpression types, and this subset is allowed to change during an XRSession.

The following is the list of expressions and their order:

enum XRExpression {
  "brow_lowerer_left",
  "brow_lowerer_right",
  "cheek_puff_left",
  "cheek_puff_right",
  "cheek_raiser_left",
  "cheek_raiser_right",
  "cheek_suck_left",
  "cheek_suck_right",
  "chin_raiser_bottom",
  "chin_raiser_top",
  "dimpler_left",
  "dimpler_right",
  "eyes_closed_left",
  "eyes_closed_right",
  "eyes_look_down_left",
  "eyes_look_down_right",
  "eyes_look_left_left",
  "eyes_look_left_right",
  "eyes_look_right_left",
  "eyes_look_right_right",
  "eyes_look_up_left",
  "eyes_look_up_right",
  "inner_brow_raiser_left",
  "inner_brow_raiser_right",
  "jaw_drop",
  "jaw_sideways_left",
  "jaw_sideways_right",
  "jaw_thrust",
  "lid_tightener_left",
  "lid_tightener_right",
  "lip_corner_depressor_left",
  "lip_corner_depressor_right",
  "lip_corner_puller_left",
  "lip_corner_puller_right",
  "lip_funneler_left_bottom",
  "lip_funneler_left_top",
  "lip_funneler_right_bottom",
  "lip_funneler_right_top",
  "lip_pressor_left",
  "lip_pressor_right",
  "lip_pucker_left",
  "lip_pucker_right",
  "lip_stretcher_left",
  "lip_stretcher_right",
  "lip_suck_left_bottom",
  "lip_suck_left_top",
  "lip_suck_right_bottom",
  "lip_suck_right_top",
  "lip_tightener_left",
  "lip_tightener_right",
  "lips_toward",
  "lower_lip_depressor_left",
  "lower_lip_depressor_right",
  "mouth_left",
  "mouth_right",
  "nose_wrinkler_left",
  "nose_wrinkler_right",
  "outer_brow_raiser_left",
  "outer_brow_raiser_right",
  "upper_lid_raiser_left",
  "upper_lid_raiser_right",
  "upper_lip_raiser_left",
  "upper_lip_raiser_right"
};

3.2. Visual examples of expressions

[Figures: one reference image per expression, starting with a neutral face and followed by each XRExpression value in the order given by the list of expressions.]

3.3. XRExpressions

interface XRExpressions {
    iterable<XRExpression, float>;

    readonly attribute unsigned long size;
    float get(XRExpression key);
};

The XRExpression enum defines the various expressions that could be reported by the user agent.

Each XRExpressions object has an [[expressions]] internal slot, which is an ordered map whose keys are of type XRExpression and whose values are of type float. Each value MUST be between 0 and 1, with 0 meaning undetected (or rest pose) and 1 meaning the maximum expression.

The ordering of the [[expressions]] internal slot is given by the list of expressions.

[[expressions]] MAY change over the course of a session but MUST stay the same for the duration of an XRFrame.

The value pairs to iterate over for an XRExpressions object are the pairs whose key is an XRExpression and whose value is the float corresponding to that XRExpression, ordered by the list of expressions.

If the user agent does not support or cannot report a defined expression, that expression MUST NOT be reported.
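The rules above (ordering by the list of expressions, values between 0 and 1, and omission of expressions that cannot be reported) can be sketched in TypeScript as follows. This is a model for illustration only: `buildExpressions` and `EXPRESSION_ORDER` are hypothetical names, a plain `Map` stands in for the [[expressions]] internal slot, and only the first four entries of the list of expressions are included for brevity. Clamping out-of-range input is also an assumption of this sketch.

```typescript
// Excerpt of the list of expressions, in the order given by this module.
// Truncated for brevity; a full model would list every value.
const EXPRESSION_ORDER = [
  "brow_lowerer_left",
  "brow_lowerer_right",
  "cheek_puff_left",
  "cheek_puff_right",
] as const;

type ExpressionName = (typeof EXPRESSION_ORDER)[number];

// Hypothetical helper modelling the [[expressions]] ordered map.
// `readings` holds only the expressions the device can currently report.
function buildExpressions(
  readings: Partial<Record<ExpressionName, number>>,
): Map<ExpressionName, number> {
  const result = new Map<ExpressionName, number>();
  for (const name of EXPRESSION_ORDER) {
    const raw = readings[name];
    // Expressions that cannot be reported are omitted, not set to 0.
    if (raw === undefined || Number.isNaN(raw)) continue;
    // Keep every value in [0, 1]: 0 is the rest pose, 1 the maximum.
    result.set(name, Math.min(1, Math.max(0, raw)));
  }
  return result;
}

const expressions = buildExpressions({
  cheek_puff_left: 1.4,
  brow_lowerer_left: 0.25,
});
// Iteration follows the list of expressions, not the order of the
// properties in the object literal above.
for (const [name, value] of expressions) {
  console.log(name, value);
}
```

Because the two unsupported expressions are absent rather than zero, a consumer can tell "not reported" apart from "reported at rest pose" by iterating the entries.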

4. Frame Loop

4.1. XRFrame

partial interface XRFrame {
    readonly attribute XRExpressions? expressions;
};
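Since the `expressions` attribute is nullable, an application is expected to check it before reading values. The TypeScript sketch below shows one way this might look inside a frame callback. The `XRExpressionsLike` and `XRFrameLike` types are structural stand-ins for the interfaces defined by this module, and `BlendWeights` and `applyExpressions` are hypothetical names for an avatar rig; treating `null` as "no expression data for this frame" is an assumption of the sketch.

```typescript
// Structural stand-ins for the interfaces defined by this module, so the
// sketch compiles without WebXR typings. Names ending in "Like" are
// assumptions for illustration.
interface XRExpressionsLike extends Iterable<[string, number]> {
  readonly size: number;
  get(key: string): number;
}

interface XRFrameLike {
  readonly expressions: XRExpressionsLike | null;
}

// Hypothetical avatar rig: maps an expression name to a blend weight.
type BlendWeights = Record<string, number>;

// Copies the expressions reported for this frame into `weights`.
// Returns the number of expressions applied.
function applyExpressions(frame: XRFrameLike, weights: BlendWeights): number {
  const expressions = frame.expressions;
  // The attribute is nullable; this sketch assumes null means that no
  // expression data is available for the frame.
  if (expressions === null) return 0;
  let applied = 0;
  for (const [name, value] of expressions) {
    weights[name] = value;
    applied++;
  }
  return applied;
}
```

Iterating the reported pairs, rather than calling `get` for every enum value, avoids depending on what `get` returns for an expression that is not currently reported.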

5. Privacy & Security Considerations

The WebXR Expression Tracking API is a powerful feature that carries significant privacy risks.

Since this feature returns new sensor data, the User Agent MUST ask for explicit consent from the user at session creation time.

Data returned from this API MUST NOT be so specific that it can be used to identify individual users. If the underlying hardware returns data that is too precise, the User Agent MUST anonymize this data before revealing it through the WebXR Expression Tracking API.
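This document does not prescribe how a User Agent reduces precision. As one possible approach, offered here only as an assumption and not as a method defined by this specification, each value could be quantized to a small number of steps before it is exposed. The `quantize` helper and its default of 20 steps are hypothetical; whether a given step count is coarse enough is a judgment left to the implementer.

```typescript
// Hypothetical helper: rounds a value in [0, 1] to the nearest of
// `steps` equal increments, reducing the precision that is exposed.
function quantize(value: number, steps: number = 20): number {
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError("steps must be a positive integer");
  }
  // Clamp first so the result always stays within [0, 1].
  const clamped = Math.min(1, Math.max(0, value));
  return Math.round(clamped * steps) / steps;
}
```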

This API MUST only be supported in XRSessions created with an XRSessionMode of "immersive-vr" or "immersive-ar". "inline" sessions MUST NOT support this API.

When anonymizing the expression data, the UA can follow these guidelines:

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Conformant Algorithms

Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and abort these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc) used in introducing the algorithm.

Conformance requirements phrased as algorithms or specific steps can be implemented in any manner, so long as the end result is equivalent. In particular, the algorithms defined in this specification are intended to be easy to understand and are not intended to be performant. Implementers are encouraged to optimize.

References

Normative References

[INFRA]
Anne van Kesteren; Domenic Denicola. Infra Standard. Living Standard. URL: https://infra.spec.whatwg.org/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[WEBIDL]
Edgar Chen; Timothy Gu. Web IDL Standard. Living Standard. URL: https://webidl.spec.whatwg.org/
[WEBXR]
Brandon Jones; Manish Goregaokar; Rik Cabanier. WebXR Device API. URL: https://immersive-web.github.io/webxr/
[WEBXR-AR-MODULE-1]
Brandon Jones; Manish Goregaokar; Rik Cabanier. WebXR Augmented Reality Module - Level 1. URL: https://immersive-web.github.io/webxr-ar-module/

Informative References

[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
