Explanations
events or processes that started at a specific point in time
words that signify the initiation of events or actions
Negative Logits
Token          Logit
congratulated  -0.66
aths           -0.66
ib             -0.64
preferred      -0.62
reused         -0.62
body           -0.60
phe            -0.59
elected        -0.59
talk           -0.58
mented         -0.57
Positive Logits
Token        Logit
anew         1.25
innoc        0.96
prematurely  0.83
raining      0.82
abruptly     0.81
WithNo       0.78
unfolding    0.77
igmatic      0.77
happening    0.76
occurring    0.76
Activations Density: 0.063%