INDEX
Explanations
phrases indicating specific locations or measurements
Negative Logits
Token    Logit
416      -0.17
nic      -0.17
ali      -0.16
ioc      -0.15
io       -0.15
ious     -0.15
Eisen    -0.14
iser     -0.14
wich     -0.14
rage     -0.14
Positive Logits
Token    Logit
eye      0.23
angles   0.20
ataka    0.19
Eye      0.19
waist    0.17
chest    0.17
wa       0.16
eye      0.16
-eye     0.16
full     0.16
Activations Density: 0.088%