INDEX
Explanations
phrases indicating temporal references or changes related to events
Negative Logits
Lastly  -0.86
)))     -0.81
"}],"   -0.78
"]      -0.77
"))     -0.77
))))    -0.77
DES     -0.75
Repeat  -0.73
"!      -0.72
trump   -0.72
Positive Logits
starters        0.72
industrialized  0.69
sofar           0.65
sooner          0.62
polls           0.60
verning         0.60
older           0.60
oret            0.59
fewer           0.58
longtime        0.58
Activations Density 0.456%