INDEX
Explanations
phrases related to being under pressure or threat
Negative Logits
QUI         -0.15
bak         -0.15
chas        -0.14
:checked    -0.14
σο          -0.14
ucher       -0.14
-api        -0.14
conc        -0.14
shal        -0.14
Nez         -0.14
Positive Logits
suspicion   0.17
pressure    0.16
attack      0.16
ursor       0.15
647         0.15
TestClass   0.15
738         0.15
زار         0.15
保护        0.15
hill        0.14
Activations Density: 0.025%