SO-101 · Robotics · Forward Kinematics

From joint angles
to end-effector pose

A visual build-up: one link, then two, then three — and how each step translates directly into code.

chapter 01 One link, one revolute joint

Start with the simplest case: a single rigid link of length L, attached to the world at one end by a revolute joint that rotates by angle θ about the Z-axis. Where is the other end?

[Interactive diagram: a single link of length L = 1.0 rotated by θ = 35° about the joint at the world origin, with the end-effector at the tip of the link.]
step 1 — the transform matrix

A single transform T maps any point in the end-effector frame to the world frame. It packs rotation + translation together:

T = [ R | t ], the 3×3 homogeneous transform (2D):

$$
T = \begin{bmatrix}
\cos\theta & -\sin\theta & L\cos\theta \\
\sin\theta & \cos\theta & L\sin\theta \\
0 & 0 & 1
\end{bmatrix}
$$
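live values (θ = 35°, L = 1.0):

$$
T \approx \begin{bmatrix}
0.819 & -0.574 & 0.819 \\
0.574 & 0.819 & 0.574 \\
0 & 0 & 1
\end{bmatrix}
$$

EE position: x ≈ 0.819, y ≈ 0.574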
What the matrix tells you: The top-left 2×2 block is the rotation. Its first column says "the end-effector's x-axis points in direction (cos θ, sin θ) in world coordinates." The top two entries of the last column are the translation: exactly where the end-effector origin is. The bottom row [0 0 1] is always fixed. Points are represented as (x, y, 1); that extra 1 is what activates the translation column.
step 2 — the code (2D, 3×3 matrix)
import numpy as np

def get_T(theta_deg, L):
    c = np.cos(np.deg2rad(theta_deg))
    s = np.sin(np.deg2rad(theta_deg))
    # 2D homogeneous: 3×3
    # point (x, y) is represented as (x, y, 1)
    T = np.array([[c, -s, L*c],   # rotation | translation
                  [s,  c, L*s],
                  [0,  0, 1]])    # always [0, 0, 1]
    return T

# End-effector position is the last column (first 2 rows):
T = get_T(35, 1.0)      # slider values from the diagram
position = T[0:2, 2]    # [tx, ty]
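Notice: the translation part of the last column, [L·cos θ, L·sin θ], is just basic trigonometry: the tip of a link of length L at angle θ. (The third entry of that column is the fixed 1 from the bottom row, not part of the translation.) The matrix is simply a convenient way to package this.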
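To see the extra 1 at work, the sketch below applies T to points given in the end-effector frame. `get_T` is re-declared so the block runs on its own, and the sample point 0.5 units along the end-effector's x-axis is an illustrative assumption.

```python
import numpy as np

def get_T(theta_deg, L):
    """2D homogeneous transform for one link of length L at angle theta_deg."""
    c = np.cos(np.deg2rad(theta_deg))
    s = np.sin(np.deg2rad(theta_deg))
    return np.array([[c, -s, L * c],
                     [s,  c, L * s],
                     [0,  0, 1.0]])

T = get_T(35, 1.0)

# The end-effector origin, written in its own frame as (0, 0, 1)
ee_origin_world = T @ np.array([0.0, 0.0, 1.0])

# A point 0.5 further along the end-effector's x-axis (illustrative)
ahead_world = T @ np.array([0.5, 0.0, 1.0])

print(ee_origin_world[:2])  # equals the translation entries T[0:2, 2]
print(ahead_world[:2])      # equals 1.5 * (cos 35°, sin 35°)
```

The origin lands exactly on the translation entries, and the second point lands further along the same direction, which is what "rotation plus translation in one multiply" means in practice.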

chapter 02 Two links — chaining transforms

Now add a second link. Joint 2 sits at the tip of link 1, and can rotate independently by angle θ₂. Where is the new tip?

We can't just add link 2's offset to link 1's tip: that offset is expressed in frame 1's coordinates, not the world's. It has to be transformed into the world frame first, and that is exactly what matrix multiplication does.

[Interactive diagram: two-link arm with sliders for θ₁ (joint 1, shown at 30°) and θ₂ (joint 2, shown at 50°).]
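T₁: world to joint 1
describes link 1 (length L₁, angle θ₁), with c₁ = cos θ₁ and s₁ = sin θ₁

$$
T_1 = \begin{bmatrix}
c_1 & -s_1 & L_1 c_1 \\
s_1 & c_1 & L_1 s_1 \\
0 & 0 & 1
\end{bmatrix}
$$

Joint 2 position: x = L₁c₁, y = L₁s₁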
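T₁₂: joint 1 to joint 2
describes link 2 (length L₂, angle θ₂ relative to link 1), with c₂ = cos θ₂ and s₂ = sin θ₂

$$
T_{12} = \begin{bmatrix}
c_2 & -s_2 & L_2 c_2 \\
s_2 & c_2 & L_2 s_2 \\
0 & 0 & 1
\end{bmatrix}
$$

The translation entries (L₂c₂, L₂s₂) are the local offset of the EE from joint 2.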
Key idea: T₁₂ gives the EE pose in link 1's frame. To get it in the world frame, premultiply by T₁. The shared "link 1" frame cancels between the two factors, leaving a transform from the EE frame straight to the world frame.
T_world→EE = T₁ · T₁₂
[Live values: in the interactive version, the chain product matrix and the EE world position (x, y) update as the sliders move.]
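Multiplying the two symbolic matrices out makes the result concrete. Writing c₁₂ = cos(θ₁ + θ₂) and s₁₂ = sin(θ₁ + θ₂) (shorthand introduced here, not in the original), the angle-sum identities collapse the product to:

```latex
T_1 T_{12}
= \begin{bmatrix}
c_1 c_2 - s_1 s_2 & -(c_1 s_2 + s_1 c_2) & L_1 c_1 + L_2 (c_1 c_2 - s_1 s_2) \\
s_1 c_2 + c_1 s_2 & c_1 c_2 - s_1 s_2 & L_1 s_1 + L_2 (s_1 c_2 + c_1 s_2) \\
0 & 0 & 1
\end{bmatrix}
= \begin{bmatrix}
c_{12} & -s_{12} & L_1 c_1 + L_2 c_{12} \\
s_{12} & c_{12} & L_1 s_1 + L_2 s_{12} \\
0 & 0 & 1
\end{bmatrix}
```

The orientation angles do add, but link 2's offset only adds to link 1's tip after it has been rotated by θ₁, which is the work the multiplication does.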
the code — two links (2D, 3×3)
import numpy as np

def get_T1(theta1_deg, L1):
    c, s = np.cos(np.deg2rad(theta1_deg)), np.sin(np.deg2rad(theta1_deg))
    return np.array([[c, -s, L1*c],
                     [s,  c, L1*s],
                     [0,  0, 1]])

def get_T12(theta2_deg, L2):   # same structure, local frame!
    c, s = np.cos(np.deg2rad(theta2_deg)), np.sin(np.deg2rad(theta2_deg))
    return np.array([[c, -s, L2*c],
                     [s,  c, L2*s],
                     [0,  0, 1]])

# Example inputs: slider angles from the diagram; unit link lengths are assumed
theta1, theta2 = 30, 50
L1, L2 = 1.0, 1.0

# Chain them: this is the forward kinematics
T_world_ee = get_T1(theta1, L1) @ get_T12(theta2, L2)
position = T_world_ee[0:2, 2]   # top of last column = EE in world
The pattern: both functions look identical — same structure, same assembly. The only difference is which angle and which length they take. Each function only "knows about" its own joint. The chain product handles the rest.
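As a sanity check on the pattern, the chain product can be compared with the textbook closed form for a planar two-link arm. This is a minimal sketch: `link_T` is a hypothetical helper that stands in for both `get_T1` and `get_T12`, and the link lengths are assumed values.

```python
import numpy as np

def link_T(theta_deg, L):
    """Shared structure of get_T1 and get_T12: rotate by theta, tip at distance L."""
    c, s = np.cos(np.deg2rad(theta_deg)), np.sin(np.deg2rad(theta_deg))
    return np.array([[c, -s, L * c],
                     [s,  c, L * s],
                     [0,  0, 1.0]])

# Slider angles from the diagram; the link lengths are assumed values
theta1, theta2 = 30.0, 50.0
L1, L2 = 1.0, 0.8

# Forward kinematics via the chain product
T_world_ee = link_T(theta1, L1) @ link_T(theta2, L2)
fk_position = T_world_ee[0:2, 2]

# Closed form: link 2 points along the summed angle theta1 + theta2
a1 = np.deg2rad(theta1)
a12 = np.deg2rad(theta1 + theta2)
closed_form = np.array([L1 * np.cos(a1) + L2 * np.cos(a12),
                        L1 * np.sin(a1) + L2 * np.sin(a12)])

print(np.allclose(fk_position, closed_form))  # True
```

The two agree, so the matrix product is doing nothing more mysterious than the trigonometry, only in a form that scales to more joints.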

chapter 03 Three links — and a fixed displacement

The SO-101 has one more ingredient: each joint transform also includes a fixed displacement — the physical offset between two joint axes that doesn't depend on any joint angle. This is just the link geometry: how far apart the screws are.

The displacement goes in the translation column, alongside the rotation. The joint angle only affects the rotation block.

[Interactive diagram: three-link arm with fixed offsets and sliders for θ₁ (joint 1, shown at 20°), θ₂ (joint 2, shown at 40°), and θ₃ (joint 3, shown at −30°).]
what each transform looks like — with displacement

For joint 2 of SO-101, the displacement d = (−0.0304, −0.0183, −0.0542) m is the fixed vector from joint 1's axis to joint 2's axis. The joint angle only enters the rotation block:

$$
T_{12} = \begin{bmatrix}
c_2 & -s_2 & 0 & d_x \\
s_2 & c_2 & 0 & d_y \\
0 & 0 & 1 & d_z \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

The translation column [d_x, d_y, d_z] is hardcoded from the URDF and never changes. Only the top-left 3×3 rotation block changes with θ₂.

[Live values: in the interactive version, the chain product T₁ · T₁₂ · T₂₃ and the EE world position (x, y) update as the sliders move.]
the code — three links, with displacement
import numpy as np

def get_T12(theta2_deg):
    # Fixed displacement: from the URDF, never changes
    displacement = np.array([-0.0304, -0.0183, -0.0542]).reshape(3, 1)
    # Rotation: only this changes with the joint angle
    c, s = np.cos(np.deg2rad(theta2_deg)), np.sin(np.deg2rad(theta2_deg))
    R = np.array([[c, -s, 0],
                  [s,  c, 0],
                  [0,  0, 1]])
    # Assemble the 4×4 block
    T = np.block([[R, displacement],
                  [0, 0, 0, 1]])
    return T

# Full chain: same @ operator, just more terms.
# get_T1 and get_T23 follow the same pattern with their own displacements;
# th1, th2, th3 are the joint angles in degrees.
T_world_ee = get_T1(th1) @ get_T12(th2) @ get_T23(th3)
position = T_world_ee[0:3, 3]
rotation = T_world_ee[0:3, 0:3]
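A quick way to convince yourself that the joint angle never touches the translation column is to build the transform at several angles and compare. The sketch below re-declares `get_T12` with the displacement quoted above; it assembles the matrix with `np.eye` and slice assignment, which is an equivalent alternative to `np.block`, and the test angles are arbitrary.

```python
import numpy as np

def get_T12(theta2_deg):
    """4x4 transform: Rz(theta2) in the rotation block, fixed offset in the last column."""
    d = np.array([-0.0304, -0.0183, -0.0542])  # displacement quoted in the text
    c, s = np.cos(np.deg2rad(theta2_deg)), np.sin(np.deg2rad(theta2_deg))
    T = np.eye(4)                               # bottom row is already [0, 0, 0, 1]
    T[0:3, 0:3] = [[c, -s, 0],
                   [s,  c, 0],
                   [0,  0, 1]]
    T[0:3, 3] = d
    return T

angles = (0, 40, 90)  # arbitrary test angles, in degrees
translations = [get_T12(a)[0:3, 3] for a in angles]
rotations = [get_T12(a)[0:3, 0:3] for a in angles]

for a, t in zip(angles, translations):
    print(a, t)  # the same displacement for every angle
```

The rotation blocks differ from angle to angle while the last column stays put, which matches the split between fixed geometry and joint variable.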

chapter 04 The SO-101 pattern

The SO-101 has one extra ingredient on top of what we've seen: axis-alignment rotations. In the idealized arms above, every joint rotates about the same axis in the same direction. In a real robot, the joint axes point in different directions, so each transform needs a fixed "setup" rotation before the joint variable is applied.

what we've done so far
R = Rz(θ)

Joint variable only. Works when all joints rotate about the same axis.

SO-101 pattern
R = R_align · Rz(θ)

Fixed alignment first, then joint variable. The alignment comes from the URDF.

For example, get_gw1 uses Rz(180) @ Rx(180) as the alignment — these two constant rotations flip the Z-axis to point upward from the robot base. Then Rz(θ₁) rotates the shoulder pan on top of that.
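The effect of that alignment can be checked numerically. This is a minimal sketch that assumes `Rx` and `Rz` take degrees and use the standard right-handed elementary rotation matrices; the codebase's own helpers may be defined differently.

```python
import numpy as np

def Rx(deg):
    """Elementary rotation about X (assumed convention: degrees, right-handed)."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0],
                     [0, c, -s],
                     [0, s, c]])

def Rz(deg):
    """Elementary rotation about Z (assumed convention: degrees, right-handed)."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0],
                     [s, c, 0],
                     [0, 0, 1]])

# Fixed alignment quoted for get_gw1
R_align = Rz(180) @ Rx(180)
print(np.round(R_align, 6))  # approximately diag(-1, 1, -1): X and Z are flipped

# Joint variable applied after the alignment (35° is an arbitrary test angle)
rotation = R_align @ Rz(35)

# The third column is the joint's Z-axis seen from the parent frame;
# it is set by the alignment alone and does not depend on the joint angle.
print(np.round(rotation[:, 2], 6))
```

Under these conventions the alignment reverses the Z-axis once and for all, and Rz(θ) then spins the child frame about that reversed axis.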

the full recipe — every get_gXY in the codebase
def get_gXY(theta_deg):
    # ① Fixed: from URDF geometry (dx, dy, dz are that joint's constants)
    displacement = (dx, dy, dz)
    # ② Fixed: R_align re-orients the axes so the joint spins about the right direction
    # ③ Variable: the actual joint angle, always Rz(θ) for SO-101
    rotation = R_align @ Rz(theta_deg)
    # ④ Assemble
    pose = np.block([[rotation, np.array(displacement).reshape(3, 1)],
                     [0, 0, 0, 1]])
    return pose

# Forward kinematics: chain all six transforms
T = gw1 @ g12 @ g23 @ g34 @ g45 @ g5t
position = T[0:3, 3]     # where is the tip?
rotation = T[0:3, 0:3]   # how is the tip oriented?
The chain rule in one sentence: gw1 carries everything downstream with it. When joint 1 rotates, joints 2–5 and the tool tip all move together, because gw1 is the leftmost factor in the product. Each joint only "knows about" its own angle; the multiplication handles how they compose.
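The "leftmost factor carries everything" behaviour shows up in a toy chain. This sketch follows the get_gXY recipe but uses a hypothetical three-joint arm with identity alignments and made-up offsets; none of the numbers come from the SO-101 URDF.

```python
from functools import reduce
import numpy as np

def make_joint(R_align, displacement):
    """Return a function theta_deg -> 4x4 pose, following the get_gXY recipe."""
    def g(theta_deg):
        a = np.deg2rad(theta_deg)
        c, s = np.cos(a), np.sin(a)
        Rz = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
        T = np.eye(4)
        T[0:3, 0:3] = R_align @ Rz   # fixed alignment, then joint variable
        T[0:3, 3] = displacement     # fixed geometry
        return T
    return g

# Hypothetical chain: identity alignment, made-up offsets in metres
offsets = ([0.0, 0.0, 0.1], [0.1, 0.0, 0.0], [0.1, 0.0, 0.0])
joints = [make_joint(np.eye(3), d) for d in offsets]

def fk(angles_deg):
    """Multiply the per-joint transforms left to right."""
    transforms = [g(a) for g, a in zip(joints, angles_deg)]
    return reduce(np.matmul, transforms)

tip_rest = fk([0, 0, 0])[0:3, 3]
tip_joint1 = fk([90, 0, 0])[0:3, 3]   # only the first joint moved
tip_joint3 = fk([0, 0, 45])[0:3, 3]   # only the last joint moved

print(tip_rest, tip_joint1, tip_joint3)
```

Turning the first joint swings the whole downstream chain, so the tip moves. Turning the last joint changes only the orientation of the final frame, so its origin stays where it was.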
summary — what each part of the matrix does
| Matrix part | What it encodes | Changes with θ? | In code |
| --- | --- | --- | --- |
| R (3×3 top-left) | How the child frame is oriented relative to the parent | Yes, via Rz(θ) | R_align @ Rz(theta_deg) |
| t (3×1 top-right) | Where the child frame's origin is in the parent frame | No, fixed geometry | displacement tuple |
| [0 0 0 1] (bottom) | Homogeneous convention: always this, enables matrix chaining | Never | hardcoded 0, 0, 0, 1 |

chapter 05 From 2D to 3D — why the matrix grows

Chapters 1–2 worked entirely in 2D: one plane, one rotation axis, 3×3 matrices. Chapters 3–4 quietly switched to 4×4 matrices, because the SO-101 lives in 3D space. Here is exactly what changes between the two, and what stays the same.

why the homogeneous trick exists at all

A plain rotation matrix can never translate, because it always maps the origin to the origin. Multiplying any matrix M by the zero vector gives zero — no matter what M is.

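❌ rotation only (2×2): can't translate

$$
\begin{bmatrix} c & -s \\ s & c \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} cx - sy \\ sx + cy \end{bmatrix}
$$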

At (0,0): result is always (0,0). Can't shift the origin.

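✓ homogeneous (3×3): rotation + translation

$$
\begin{bmatrix} c & -s & t_x \\ s & c & t_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
=
\begin{bmatrix} cx - sy + t_x \\ sx + cy + t_y \\ 1 \end{bmatrix}
$$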

The fake 1 "activates" the translation column. The point is still 2D — you ignore the 1 in the output.
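The difference is easy to see numerically. The sketch below uses an arbitrary angle and translation (illustrative values only) and sends the origin through both forms. It also sends a vector with a trailing 0 through the homogeneous matrix, a standard side effect of the same convention that the text above does not cover.

```python
import numpy as np

theta = np.deg2rad(35)
c, s = np.cos(theta), np.sin(theta)
tx, ty = 2.0, -1.0   # arbitrary translation for illustration

R = np.array([[c, -s],
              [s,  c]])
T = np.array([[c, -s, tx],
              [s,  c, ty],
              [0,  0, 1.0]])

origin_plain = R @ np.array([0.0, 0.0])       # the origin cannot move
origin_homog = T @ np.array([0.0, 0.0, 1.0])  # the origin lands on (tx, ty)

# With a trailing 0 instead of 1, the translation column is switched off,
# so the vector is rotated but not shifted.
direction = T @ np.array([1.0, 0.0, 0.0])

print(origin_plain)
print(origin_homog[:2])
print(direction[:2])
```

Only the last component of the input decides whether the translation column takes part, which is why points always carry a 1.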

The +1 rule: n-dimensional space → (n+1)×(n+1) homogeneous transform. 2D needs 3×3. 3D needs 4×4. The bottom row is always [0 … 0 1].
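The +1 rule can be written once for any dimension. `homogeneous` below is a hypothetical helper, not something from the SO-101 codebase. It packs an n×n rotation and a length-n translation into an (n+1)×(n+1) matrix.

```python
import numpy as np

def homogeneous(R, t):
    """Pack an n×n rotation R and a length-n translation t into an (n+1)×(n+1) transform."""
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float)
    n = R.shape[0]
    T = np.eye(n + 1)   # starts with the bottom row [0 ... 0 1] already in place
    T[:n, :n] = R       # rotation block, top-left
    T[:n, n] = t        # translation, last column
    return T

T2 = homogeneous(np.eye(2), [1.0, 2.0])        # 2D case
T3 = homogeneous(np.eye(3), [1.0, 2.0, 3.0])   # 3D case

print(T2.shape, T3.shape)  # (3, 3) (4, 4)
print(T2[-1], T3[-1])      # bottom rows: zeros followed by a single 1
```

Starting from the identity keeps the bottom row correct by construction, so only the two blocks that carry information need to be filled in.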
side-by-side: 2D (3×3) vs 3D (4×4)
2D homogeneous: 3×3
point: (x, y, 1)

$$
\begin{bmatrix}
\cos\theta & -\sin\theta & t_x \\
\sin\theta & \cos\theta & t_y \\
0 & 0 & 1
\end{bmatrix}
$$

R block: 2×2, one rotation axis (Z only)
t column: 2 values, (tx, ty)
bottom row: [0, 0, 1]

3D homogeneous: 4×4
point: (x, y, z, 1)

$$
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_x \\
r_{21} & r_{22} & r_{23} & t_y \\
r_{31} & r_{32} & r_{33} & t_z \\
0 & 0 & 0 & 1
\end{bmatrix}
$$

R block: 3×3, can rotate about X, Y, or Z
t column: 3 values, (tx, ty, tz)
bottom row: [0, 0, 0, 1]
what changes: the rotation block

In 2D there is only one way to rotate — in the XY plane. In 3D you can rotate about any axis, so you need three elementary rotation matrices instead of one:

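With c = cos θ and s = sin θ:

Rx(θ): X stays fixed, Y/Z mix

$$
R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & c & -s \\ 0 & s & c \end{bmatrix}
$$

Ry(θ): Y stays fixed, X/Z mix

$$
R_y(\theta) = \begin{bmatrix} c & 0 & s \\ 0 & 1 & 0 \\ -s & 0 & c \end{bmatrix}
$$

Rz(θ): Z stays fixed, X/Y mix

$$
R_z(\theta) = \begin{bmatrix} c & -s & 0 \\ s & c & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$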

Notice: Rz is identical to the 2D rotation matrix, just padded with a third row/column. Rotating in the XY plane is the same thing whether you're in 2D or 3D — you just now also have a Z axis that doesn't move.
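The three elementary rotations translate directly into NumPy. This is a minimal sketch assuming helpers that take degrees; the names `Rx`, `Ry`, `Rz` mirror the notation above, and `R2d` is a hypothetical name for the plain 2D rotation. The checks confirm that each matrix leaves its own axis fixed and that Rz contains the 2D rotation in its top-left corner.

```python
import numpy as np

def _cs(deg):
    a = np.deg2rad(deg)
    return np.cos(a), np.sin(a)

def Rx(deg):
    c, s = _cs(deg)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(deg):
    c, s = _cs(deg)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(deg):
    c, s = _cs(deg)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def R2d(deg):
    c, s = _cs(deg)
    return np.array([[c, -s], [s, c]])

angle = 35  # arbitrary test angle in degrees
axes = np.eye(3)  # rows are the X, Y, Z unit vectors

# Each elementary rotation leaves its own axis where it was
axis_fixed = [np.allclose(R(angle) @ axes[i], axes[i])
              for i, R in enumerate((Rx, Ry, Rz))]

# Rz is the 2D rotation padded with a third row and column
rz_matches_2d = np.allclose(Rz(angle)[:2, :2], R2d(angle))

print(axis_fixed, rz_matches_2d)  # [True, True, True] True
```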

code: 2D vs 3D transform function
2D: 3×3 (what chapters 1–2 use)
T = np.array([
    [c, -s, tx],
    [s,  c, ty],
    [0,  0, 1]
])
# extract position:
pos = T[0:2, 2]   # [tx, ty]
3D: 4×4 (the SO-101 code)
T = np.block([
    [R, t],          # R is 3×3, t is 3×1
    [0, 0, 0, 1]
])
# extract position:
pos = T[0:3, 3]   # [tx, ty, tz]
Everything else is the same. The chain multiplication rule is identical: T_world→EE = T₁ · T₁₂ · … The bottom row is always zeros + 1. The translation is always the last column. The rotation block is always the top-left square. Only the sizes change: 3×3 → 4×4, and the rotation block gains the ability to express rotations about all three axes, which is why you need Rx, Ry, Rz in 3D but only one rotation matrix in 2D.