INDEX
Explanations
phrases indicative of occurrence, predictions, or conditions over time
Negative Logits
orthand    -0.13
rezent     -0.13
izr        -0.13
ется       -0.12
tual       -0.12
ерта       -0.12
βάλ        -0.12
ователь    -0.12
erken      -0.12
logic      -0.12
Positive Logits
last     1.09
last     0.91
Last     0.88
Last     0.87
-last    0.84
_last    0.84
.last    0.82
LAST     0.82
last     0.76
(last    0.72
Activations Density: 0.308%