Explanations
elements related to specific years or dates
Negative Logits
iola        -0.18
din         -0.15
iert        -0.15
eph         -0.15
kate        -0.14
елов        -0.14
shr         -0.14
inclusive   -0.14
iar         -0.14
esta        -0.14
Positive Logits
adolu       0.16
flair       0.15
OWER        0.14
olk         0.14
omas        0.14
Commun      0.13
[--         0.13
っき        0.13
deş         0.13
oppon       0.13
Activations Density 0.003%