INDEX
Explanations
first occurrences and time-related references in the text
Negative Logits
zin             -0.18
ë¬              -0.14
Feder           -0.14
ÑĢаб            -0.14
Bout            -0.14
à¹ĭ             -0.13
resp            -0.13
deny            -0.13
elite           -0.13
ãĥ¼ãĥĸ          -0.13
Positive Logits
lessly          0.15
preferredStyle  0.15
igh             0.15
olare           0.14
опол            0.14
grily           0.14
sap             0.14
uluk            0.14
eko             0.14
eward           0.14
Activations Density: 0.313%