Negative Logits
modifiers      -0.07
finished       -0.06
Constit        -0.06
contains       -0.06
speeches       -0.06
瓜             -0.06
prosecution    -0.06
馬             -0.06
violated       -0.06
flush          -0.06
Positive Logits
kone           0.07
IN             0.06
cevap          0.06
Await          0.06
才能           0.06
Unt            0.06
VIC            0.06
chiropr        0.06
rapy           0.06
Merit          0.06
Activations Density 0.022%