Explanations
statements or pieces of information that have been explicitly declared or affirmed
instances of the word "stated"
Negative Logits
Token        Logit
brance       -0.71
vernment     -0.68
sacked       -0.67
xtap         -0.67
agra         -0.66
Carbuncle    -0.63
abies        -0.63
sites        -0.63
gone         -0.63
illard       -0.62
Positive Logits
Token        Logit
gow          0.92
stated       0.83
bluntly      0.75
stating      0.73
defaults     0.72
:"           0.71
orical       0.71
states       0.70
:]           0.70
mentions     0.69
Activations Density 0.028%