INDEX
Explanations
possessive pronouns followed by body parts
Negative Logits
outperformed    0.53
your            0.52
voted           0.46
Your            0.43
您              0.43
inizin          0.42
pushed          0.42
categorized     0.42
%               0.42
pecific         0.42
Positive Logits
nostrils        0.45
semblables      0.43
eyelids         0.42
cellules        0.40
ankles          0.40
flancs          0.40
이는            0.40
own             0.40
}$;             0.40
atrium          0.39
Activations Density 0.079%