INDEX
Explanations
phrases indicating ownership or possession
Negative Logits
116     -0.16
身上    -0.15
zn      -0.15
esel    -0.14
ets     -0.14
einzel  -0.14
zac     -0.14
lea     -0.13
enga    -0.13
eden    -0.13
Positive Logits
larger   0.25
wider    0.23
overall  0.21
overall  0.20
larg     0.20
Larger   0.20
ongoing  0.20
suite    0.20
effort   0.19
broader  0.19
Activations Density 0.041%