INDEX
NEGATIVE LOGITS
out              -0.08
_Response        -0.07
Baldwin          -0.06
Batı             -0.06
Có               -0.06
_organization    -0.06
ола              -0.06
intensified      -0.06
wid              -0.06
Intro            -0.06
POSITIVE LOGITS
mişti            0.07
emplate          0.07
arseille         0.06
those            0.06
Long             0.06
↵                0.06
.usage           0.06
.Internal        0.06
sugars           0.06
داشتند           0.06
Activations Density 0.008%