INDEX
Explanations
references to academic articles and various types of bibliographic information
Negative Logits
odiac   -0.15
coni    -0.15
getti   -0.15
ufen    -0.15
orgen   -0.14
859     -0.14
éİ      -0.14
721     -0.14
æ²ī     -0.14
rof     -0.14
Positive Logits
inv     0.17
Bul     0.17
ама     0.17
Tat     0.16
hir     0.16
abox    0.16
Pal     0.15
Pal     0.15
isty    0.15
бол     0.15
Activations Density: 0.027%