Explanations
phrases indicating potential or hypothetical situations involving "having" or "being."
Negative Logits
now          -0.27
now          -0.26
lately       -0.20
yet          -0.20
-now         -0.19
теперь       -0.19
hasn         -0.18
evidently    -0.18
apparently   -0.18
recently     -0.18
Positive Logits
known        0.24
Known        0.20
expected     0.20
known        0.19
easily       0.18
been         0.18
Known        0.17
sooner       0.17
asily        0.17
guessed      0.17
Activations Density 0.079%