Explanations
mentions of specific years or numerical references in the text
Negative Logits
774        -0.15
lem        -0.14
ItemType   -0.14
íĸī        -0.14
Ìĥ         -0.14
STATES     -0.14
deep       -0.13
erli       -0.13
rina       -0.13
šov        -0.13
Positive Logits
asse       0.16
mada       0.16
iner       0.15
indy       0.15
enu        0.15
season     0.14
ulis       0.14
azu        0.14
asn        0.14
efa        0.14
Activations Density: 0.062%