INDEX
Explanations
references to specific locations and geographic areas
Negative Logits
kova      -0.15
ëĤ¨       -0.15
_exempt   -0.14
atte      -0.14
voks      -0.13
èŀº       -0.13
entry     -0.13
ournal    -0.13
antha     -0.13
Hur       -0.13
Positive Logits
Awake     0.15
fcn       0.15
ä¿Ĭ       0.14
ãĥĥãĤ¯    0.14
inya      0.14
natives   0.14
pong      0.14
paren     0.14
uren      0.14
rgan      0.14
Activations Density: 0.117%