INDEX
Explanations
instances that emphasize urgency or a call to action
Negative Logits
Token       Logit
Franch      -0.85
snail       -0.75
seiz        -0.69
obser       -0.69
disse       -0.68
antip       -0.65
dissemin    -0.63
mainland    -0.61
promul      -0.61
stomp       -0.61
Positive Logits
Token       Logit
Ŀ           1.74
¡           1.43
¤           1.20
¦           1.19
Ĵ           1.17
Ķ           1.16
ľ           1.15
ĺ           1.13
ª           1.13
ŀ           1.13
Activations Density: 0.715%