INDEX
Explanations
information related to historical events, political figures, and government activities
instances of dates or time-related phrases within a text
Negative Logits
inar       -0.70
ertain     -0.67
mite       -0.67
iq         -0.66
eed        -0.64
olson      -0.64
geist      -0.64
imize      -0.63
ocations   -0.63
inate      -0.62
Positive Logits
flanked       1.04
sparking      0.99
shortly       0.98
marking       0.97
citing        0.95
prompting     0.93
ostensibly    0.91
intending     0.90
coinc         0.89
accompanied   0.89
Activations Density: 0.304%