Explanations
references to individual items or instances presented sequentially
"at a time"
Negative Logits
Token          Logit
to             -0.47
res            -0.42
too            -0.42
no             -0.42
i              -0.40
(              -0.38
_              -0.37
Sp             -0.37
too            -0.36
(none shown)   -0.36
Positive Logits
Token          Logit
time           2.48
time           1.85
TIME           1.68
Time           1.60
Time           1.49
TIME           1.47
tyme           1.38
tijd           1.37
времени        1.34
vez            1.28
Activations Density 0.153%
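
As a rough illustration of what a density figure like the one above usually means, the sketch below computes the fraction of token positions on which a feature fires. It assumes that density is defined as the share of positions with activation above a threshold; that definition, the function name activation_density, and the sample data are all assumptions for illustration and do not come from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of positions where the feature is
    active. The page itself does not state how its figure is computed.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 2000 token positions, 3 of which activate the feature.
sample = [0.0] * 1997 + [1.2, 0.4, 3.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this definition, a density of 0.153% would correspond to the feature firing on roughly 1 or 2 of every 1000 token positions.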