Explanations
instances of the word "pose" or its variations, particularly in contexts involving photos or appearances
Negative Logits
spath    -0.16
zeug     -0.16
lä       -0.15
lej      -0.15
tre      -0.14
odge     -0.14
iego     -0.14
');?>"   -0.14
lea      -0.14
uir      -0.13
Positive Logits
pose     0.27
idon     0.26
pose     0.20
Pos      0.20
=pos     0.20
Pose     0.19
threat   0.18
posed    0.18
poses    0.18
poses    0.18
Activations Density: 0.014%