INDEX
NEGATIVE LOGITS
!/              -0.09
Audrey          -0.09
ypos            -0.08
Chinatown       -0.08
obt             -0.07
she             -0.07
persecution     -0.07
arr             -0.07
elde            -0.07
arro            -0.07
POSITIVE LOGITS
knobs           0.08
المعرفة         0.08
gates           0.08
Babel           0.08
modifications   0.08
personalization 0.08
modifying       0.08
ritable         0.08
글로벌          0.08
<J              0.08
Activations Density: 0.002%