A playground for controlling a VRoid avatar with user pose estimation.
- Real-time avatar control
  - Uses MediaPipe Holistic / Thunder model as a pose detector.
  - Tracks body, face, and both hands from the webcam feed.
- VRoid / VRM support
  - Loads a VRoid avatar (`VRoid_Woman.vrm`) via `GLTFLoader` and `VRMLoaderPlugin`.
  - Applies Kalidokit's solved pose/face/hand data to the humanoid rig.
- Runs entirely in the browser
  - No backend is required for pose estimation or avatar control.
  - All processing stays on the client for privacy.
- Expo + React Native for Web
  - Implemented as an Expo app and exported to static web.
  - Easy to run in development mode on mobile or web.
Web camera
↓
@mediapipe/pose on the frontend
↓
poseLandmarks + poseWorldLandmarks
↓
PosePerson-compatible object
↓
Kalidokit.Pose.solve(..., { runtime: "mediapipe", enableLegs: true })
↓
VRM humanoid normalized bones
↓
Three.js render loop
Important points:
- The current implementation is frontend-only.
- MediaPipe Pose is loaded from CDN: `@mediapipe/pose@0.5`.
- OpenPose itself is not executed in the browser.
- BODY_25-like points are kept mainly for debug display, not as the main VRM-driving path.
- The avatar-driving path should match the Kalidokit-based path used by `pose_estimation`.
At a high level, the pipeline is:
- Webcam capture
  - The browser captures your webcam stream (portrait-oriented, e.g. 480×640).
- MediaPipe Holistic (Thunder)
  - Holistic is dynamically loaded from a CDN at runtime to avoid bundler issues like `"Holistic is not a constructor"`.
  - It produces:
    - 2D and 3D pose landmarks
    - Face landmarks
    - Left / right hand landmarks
- Kalidokit solving
  - Kalidokit consumes the landmarks and solves:
    - RiggedPose (body / hips / spine / limbs)
    - RiggedFace (eyes, mouth, head rotation, etc.)
    - RiggedHand for left and right hands
- VRM rigging (Three.js + @pixiv/three-vrm)
  - A VRM avatar is loaded and normalized (using `VRMUtils.removeUnnecessaryJoints()` and `VRMUtils.rotateVRM0()`).
  - The solved Kalidokit data is applied to the VRM humanoid bones (hips, spine, arms, fingers, etc.).
  - A Three.js render loop updates the avatar every frame.
- UI / controls
  - A simple settings bar lets you:
    - Start/stop camera and tracking
    - See status (camera / Holistic / VRM)
    - Open links (e.g. repository, demo)
- Frontend
  - Expo / React Native
  - React Native for Web
  - TypeScript / TSX
- 3D & Avatar
  - Three.js
  - `@pixiv/three-vrm` for VRM avatars
  - `GLTFLoader` + `VRMLoaderPlugin`
- Pose & Animation
  - MediaPipe Holistic (Thunder model)
  - Kalidokit for rigging
- Tooling / Infra
  - Docker & Docker Compose
  - GitHub Actions (CI, Docker tests, GitHub Pages deployment)
The sample avatar in this repository is provided by Metaverse Yokosuka (「メタバースヨコスカ」).
For more details, see: https://metaverse-yokosuka.com/
The avatar is a VRM humanoid model loaded through Three.js:
const loader = new GLTFLoader();
loader.register((parser) => new VRMLoaderPlugin(parser));

After loading, the avatar is normalized and placed in the scene:
VRMUtils.removeUnnecessaryVertices(gltf.scene);
VRMUtils.removeUnnecessaryJoints(gltf.scene);
vrm.scene.position.set(0, -1.05, 0);
vrm.scene.rotation.set(0, Math.PI, 0);
vrm.scene.scale.setScalar(1.0);
vrm.humanoid.resetNormalizedPose();
vrm.humanoid.update();

Key findings:
- Keep the VRM root transform stable.
- Do not move the whole avatar every frame using `Kalidokit.Hips.position`.
- Use VRM humanoid normalized bone names, not raw Three.js skeleton bone names.
- Apply rotations through `vrm.humanoid.getNormalizedBoneNode(...)`.
The TypeScript implementation uses these VRM normalized humanoid bone names:
const VRM_BONE_NAMES = {
hips: "hips",
spine: "spine",
chest: "chest",
upperChest: "upperChest",
neck: "neck",
head: "head",
leftUpperArm: "leftUpperArm",
leftLowerArm: "leftLowerArm",
leftHand: "leftHand",
rightUpperArm: "rightUpperArm",
rightLowerArm: "rightLowerArm",
rightHand: "rightHand",
leftUpperLeg: "leftUpperLeg",
leftLowerLeg: "leftLowerLeg",
leftFoot: "leftFoot",
rightUpperLeg: "rightUpperLeg",
rightLowerLeg: "rightLowerLeg",
rightFoot: "rightFoot",
};

The application path is:
const bone = vrm.humanoid.getNormalizedBoneNode(boneName);
bone.rotation.set(x, y, z, "XYZ");

This is different from directly manipulating arbitrary Three.js skeleton bones. A visible `SkeletonHelper` confirms that the VRM skeleton exists, but it does not prove that the avatar is being driven correctly through the VRM humanoid normalized bones.
MediaPipe Pose returns 33 landmarks. For debug display and comparison, the app also builds BODY_25-like names:
const BODY25_TO_MEDIAPIPE = {
Nose: 0,
Neck: null,
RShoulder: 12,
RElbow: 14,
RWrist: 16,
LShoulder: 11,
LElbow: 13,
LWrist: 15,
MidHip: null,
RHip: 24,
RKnee: 26,
RAnkle: 28,
LHip: 23,
LKnee: 25,
LAnkle: 27,
REye: 5,
LEye: 2,
REar: 8,
LEar: 7,
LBigToe: 31,
LHeel: 29,
RBigToe: 32,
RHeel: 30,
};

Synthetic helper joints:
- `Neck` is the average of left shoulder and right shoulder.
- `MidHip` is the average of left hip and right hip.
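The two synthetic joints can be computed as simple midpoints. The sketch below is illustrative only: the `Landmark` shape, the `midpoint` and `syntheticJoints` helpers, and the choice to keep the lower of the two visibility values are assumptions, not code taken from this repository.

```typescript
// Minimal landmark shape (assumed): MediaPipe landmarks carry x, y, z and an optional visibility.
interface Landmark {
  x: number;
  y: number;
  z: number;
  visibility?: number;
}

// Indices follow the MediaPipe Pose 33-landmark layout used in the mapping above.
const LEFT_SHOULDER = 11;
const RIGHT_SHOULDER = 12;
const LEFT_HIP = 23;
const RIGHT_HIP = 24;

// Hypothetical helper: component-wise average of two landmarks.
// The lower visibility of the pair is kept so a hidden joint is not reported as visible.
function midpoint(a: Landmark, b: Landmark): Landmark {
  return {
    x: (a.x + b.x) / 2,
    y: (a.y + b.y) / 2,
    z: (a.z + b.z) / 2,
    visibility: Math.min(a.visibility ?? 1, b.visibility ?? 1),
  };
}

// Hypothetical helper: returns null when the landmark array is incomplete.
function syntheticJoints(
  landmarks: Landmark[],
): { Neck: Landmark; MidHip: Landmark } | null {
  if (landmarks.length < 33) return null;
  return {
    Neck: midpoint(landmarks[LEFT_SHOULDER], landmarks[RIGHT_SHOULDER]),
    MidHip: midpoint(landmarks[LEFT_HIP], landmarks[RIGHT_HIP]),
  };
}
```

These joints are intended for the debug overlay only, consistent with keeping BODY_25-like points out of the avatar-driving path.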
Important conclusion:
- BODY_25-like bones are useful for visual debugging.
- They should not be mixed with Kalidokit Euler rotations as another avatar-control layer.
- Mixing BODY_25 retargeting and Kalidokit retargeting creates inconsistent arm, elbow, and leg axes.
This project uses three different coordinate spaces:
| Space | Data | Meaning |
|---|---|---|
| MediaPipe 2D | `poseLandmarks` | Normalized image-space coordinates. x and y are screen-based. |
| MediaPipe world | `poseWorldLandmarks` | 3D-like landmarks used by Kalidokit. |
| Three.js / VRM | Avatar scene | Y-up world. The VRM root is fixed at (0, -1.05, 0). |
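To illustrate the first row: normalized image-space landmarks have to be scaled by the frame size before they can be drawn on a debug overlay. The following is a minimal sketch; `toPixel`, `PixelPoint`, and the `mirror` option are hypothetical names introduced here, and the 480×640 frame size is only the example size mentioned earlier.

```typescript
// Normalized image-space point: x and y are in [0, 1], origin at the top-left of the frame.
interface NormalizedLandmark {
  x: number;
  y: number;
}

interface PixelPoint {
  px: number;
  py: number;
}

// Hypothetical helper: scales a normalized landmark to pixel coordinates.
// `mirror` flips x for a selfie-style preview; whether the app mirrors is an assumption.
function toPixel(
  lm: NormalizedLandmark,
  width: number,
  height: number,
  mirror = false,
): PixelPoint {
  const x = mirror ? 1 - lm.x : lm.x;
  return { px: x * width, py: lm.y * height };
}

// Example: a landmark a quarter of the way across a 480x640 portrait frame.
const example = toPixel({ x: 0.25, y: 0.5 }, 480, 640);
```

This conversion applies only to `poseLandmarks`. World landmarks are not image-relative and should not be scaled this way.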
The avatar-control path requires both landmark sets:
person.mediapipeWorldLandmarks
person.mediapipeLandmarks

Then Kalidokit is called as follows:
Kalidokit.Pose.solve(
person.mediapipeWorldLandmarks,
person.mediapipeLandmarks,
{
runtime: "mediapipe",
enableLegs: true,
}
);

If `poseWorldLandmarks` is missing, a faithful `pose_estimation`-style path should not silently substitute 2D landmarks as world landmarks. That may make the avatar move, but it corrupts the coordinate system.
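One way to enforce this is to skip the frame entirely when either landmark set is absent. The sketch below assumes a `PosePerson` shape with only the two fields named above. The `PoseResults` interface and the `toPosePerson` helper are hypothetical and may differ from the actual implementation.

```typescript
interface Landmark {
  x: number;
  y: number;
  z: number;
  visibility?: number;
}

// Assumed subset of the MediaPipe Pose result object.
interface PoseResults {
  poseLandmarks?: Landmark[];
  poseWorldLandmarks?: Landmark[];
}

// Assumed shape of the PosePerson-compatible object; only the two fields used above.
interface PosePerson {
  mediapipeLandmarks: Landmark[];
  mediapipeWorldLandmarks: Landmark[];
}

const EXPECTED_LANDMARKS = 33;

// Hypothetical helper: returns null instead of falling back to 2D landmarks,
// so the caller skips the frame rather than solving in the wrong coordinate space.
function toPosePerson(results: PoseResults): PosePerson | null {
  const lm2d = results.poseLandmarks;
  const lm3d = results.poseWorldLandmarks;
  if (!lm2d || !lm3d) return null;
  if (lm2d.length !== EXPECTED_LANDMARKS || lm3d.length !== EXPECTED_LANDMARKS) {
    return null;
  }
  return { mediapipeLandmarks: lm2d, mediapipeWorldLandmarks: lm3d };
}
```

When the helper returns `null`, the caller can simply leave the avatar in its previous pose for that frame.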
The stable application order is:
const solvedPose = Kalidokit.Pose.solve(...);
vrm.humanoid.resetNormalizedPose();
applySolvedPose(vrm, solvedPose);
vrm.humanoid.update();

`applySolvedPose` maps Kalidokit output to VRM normalized bones:
applyRotation(vrm, "hips", solvedPose.Hips.rotation, 0.35);
applyRotation(vrm, "spine", solvedPose.Spine, 0.35);
applyRotation(vrm, "chest", solvedPose.Chest ?? solvedPose.Spine, 0.25);
applyRotation(vrm, "upperChest", solvedPose.Chest ?? solvedPose.Spine, 0.2);
applyRotation(vrm, "neck", solvedPose.Neck, 0.35);
applyRotation(vrm, "head", solvedPose.Head, 0.5);
applyRotation(vrm, "leftUpperArm", solvedPose.LeftUpperArm, 1.0);
applyRotation(vrm, "leftLowerArm", solvedPose.LeftLowerArm, 1.0);
applyRotation(vrm, "leftHand", solvedPose.LeftHand, 0.7);
applyRotation(vrm, "rightUpperArm", solvedPose.RightUpperArm, 1.0);
applyRotation(vrm, "rightLowerArm", solvedPose.RightLowerArm, 1.0);
applyRotation(vrm, "rightHand", solvedPose.RightHand, 0.7);
applyRotation(vrm, "leftUpperLeg", solvedPose.LeftUpperLeg, 0.8);
applyRotation(vrm, "leftLowerLeg", solvedPose.LeftLowerLeg, 0.8);
applyRotation(vrm, "leftFoot", solvedPose.LeftFoot, 0.6);
applyRotation(vrm, "rightUpperLeg", solvedPose.RightUpperLeg, 0.8);
applyRotation(vrm, "rightLowerLeg", solvedPose.RightLowerLeg, 0.8);
applyRotation(vrm, "rightFoot", solvedPose.RightFoot, 0.6);

Findings from debugging:
- Reconstructing upper/lower arm rotations manually from `shoulder → elbow → wrist` is fragile unless the rest axis and parent-local frame are calibrated exactly.
- The active `pose_estimation` VRM path is Kalidokit-based.
- Do not apply both custom BODY_25 quaternions and Kalidokit rotations to the same bones in the same frame.
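The `applyRotation` helper called above is not shown in this README. A minimal sketch follows, assuming that the fourth argument is a dampening weight that scales the solved Euler angles and that missing rotations or bones are skipped. `VrmLike`, `BoneLike`, `EulerLike`, and `SolvedRotation` are hypothetical structural stand-ins for the Three.js and `@pixiv/three-vrm` types, used so the block runs without those packages.

```typescript
// Structural stand-ins (assumptions) for the parts of Three.js / three-vrm used here.
interface EulerLike {
  x: number;
  y: number;
  z: number;
  set(x: number, y: number, z: number, order?: string): unknown;
}

interface BoneLike {
  rotation: EulerLike;
}

interface VrmLike {
  humanoid: {
    getNormalizedBoneNode(name: string): BoneLike | null;
  };
}

// Euler angles in radians, as produced by a pose solver.
interface SolvedRotation {
  x: number;
  y: number;
  z: number;
}

// Hypothetical implementation: scales the solved rotation by `weight` and writes it
// to the normalized humanoid bone. Returns false when nothing was applied.
function applyRotation(
  vrm: VrmLike,
  boneName: string,
  rotation: SolvedRotation | undefined,
  weight = 1,
): boolean {
  if (!rotation) return false; // the solver did not provide this bone
  const bone = vrm.humanoid.getNormalizedBoneNode(boneName);
  if (!bone) return false; // the avatar does not define this bone
  bone.rotation.set(
    rotation.x * weight,
    rotation.y * weight,
    rotation.z * weight,
    "XYZ",
  );
  return true;
}
```

Because the pose is reset with `resetNormalizedPose()` before each frame, scaling from the rest pose is enough here. A version that interpolates from the previous frame would need to keep its own state.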
# set environment variables:
export REACT_NATIVE_PACKAGER_HOSTNAME=${YOUR_HOST}
# Build the image
docker compose build
# Run the container
docker compose up

docker compose \
  -f docker-compose.test.yml up \
  --build --exit-code-from \
  frontend_test

- Apache License 2.0