INDEX
Explanations
mentions of statements made by individuals
instances of the word "said."
Negative Logits
=~=~       -0.86
asu        -0.82
ptives     -0.77
EDIT       -0.75
folios     -0.71
pleting    -0.71
ntil       -0.66
à¦         -0.65
EMBER      -0.65
ernels     -0.64
Positive Logits
goodbye    1.24
doms       0.86
hello      0.83
aloud      0.80
mith       0.76
anecd      0.71
afterward  0.69
Goodbye    0.69
ieu        0.68
they       0.67
Activations Density 0.231%