INDEX
Negative Logits
sayı  -0.07
int  -0.07
成功  -0.07
dots  -0.06
_hop  -0.06
committee  -0.06
pretended  -0.06
----------------------------------------------------------------  -0.06
brittle  -0.06
_hello  -0.06
Positive Logits
idiots  0.07
_until  0.06
locator  0.06
وم  0.06
тие  0.06
Traffic  0.06
Dep  0.06
av  0.06
druh  0.06
ỉ  0.06
Activations Density 0.023%