INDEX
Explanations
dates or time references in the text
occurrences of the word "when"
Negative Logits
whatever     -0.78
ertain       -0.76
vre          -0.69
ouble        -0.68
urden        -0.67
\\\\\\\\     -0.67
ktop         -0.66
ivot         -0.65
atown        -0.64
nex          -0.64
Positive Logits
soever       0.99
upon         0.93
someone      0.84
confronted   0.79
suddenly     0.75
they         0.74
hordes       0.73
somebody     0.72
reporters    0.71
researchers  0.70
Activations Density 0.108%