Explanations
proper nouns, particularly related to locations and organizations
Negative Logits
celik
-0.16
afil
-0.15
helicopt
-0.15
viso
-0.15
lij
-0.15
olist
-0.14
ρας
-0.13
šti
-0.13
efa
-0.13
>NN
-0.13
Positive Logits
-
0.55
-
0.41
–
0.29
-↵
0.27
--
0.26
_-_
0.24
{-
0.23
−
0.22
()-
0.21
-↵↵
0.21
Activations Density 0.175%