INDEX
Explanations
words describing events or actions that occur in sequential or temporal order
instances of the word "after."
Negative Logits
ahime  -0.84
olds  -0.80
cci  -0.79
arnaev  -0.79
ãĥ¯  -0.78
ouble  -0.77
guiActiveUn  -0.76
çīĪ  -0.75
JV  -0.74
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³  -0.71
Positive Logits
noticing  1.30
spotting  1.28
discovering  1.23
witnessing  1.22
receiving  1.20
completing  1.16
seeing  1.15
arriving  1.15
hearing  1.13
realizing  1.12
Activations Density 0.108%
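
For readers unfamiliar with the density figure, the following is a minimal sketch of one common way such a number can be computed, assuming that density means the fraction of token positions on which the feature's activation is above zero. The page does not state its exact definition, so the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of token positions where the
    feature fires; the source page does not confirm this definition.
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 11 of which fire.
sample = [0.0] * 9989 + [0.7] * 11
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.108% would mean the feature is active on roughly one token in every thousand, which is typical of a sparse feature.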