INDEX
Explanations
repeated occurrences of the word "ever."
Negative Logits
eric      -0.18
ively     -0.18
side      -0.17
arella    -0.16
emean     -0.16
quine     -0.16
们        -0.16
sg        -0.15
uras      -0.15
ermen     -0.15
Positive Logits
greens    0.24
-present  0.19
green     0.19
theless   0.18
lasting   0.18
last      0.18
hone      0.17
leigh     0.16
igth      0.16
flo       0.16
Activations Density 0.030%