INDEX
Explanations
phrases indicating something is recently established or newly classified
Negative Logits
ekler      -0.19
’ta        -0.16
zon        -0.16
udu        -0.15
erken      -0.15
.ua        -0.15
AMESPACE   -0.15
STS        -0.14
ernes      -0.14
ota        -0.14
Positive Logits
ly         0.18
iber       0.18
bian       0.17
ighton     0.17
bies       0.15
mente      0.15
iger       0.15
ko         0.14
N          0.14
æļ         0.14
Activations Density 0.014%