INDEX
NEGATIVE LOGITS
date    -0.08
words   -0.07
Oper    -0.07
date    -0.07
were    -0.07
link    -0.07
Net     -0.06
coke    -0.06
veri    -0.06
Ryan    -0.06
POSITIVE LOGITS
انگ           0.07
inclination   0.07
olean         0.07
ulating       0.06
stockholm     0.06
930           0.06
葡            0.06
etically      0.06
าช            0.06
"'.           0.06
Activations Density 0.086%