INDEX
Explanations
the repeated phrase "every single" followed by various nouns
phrases emphasizing singularity or individual importance
Negative Logits
ctr           -0.77
eln           -0.73
ello          -0.73
ÃĥÃĤ          -0.67
aba           -0.67
srfAttach     -0.66
orthy         -0.65
laden         -0.64
chi           -0.64
dor           -0.63
Positive Logits
THING         1.08
imaginable    0.99
conceivable   0.90
goddamn       0.88
facet         0.88
WHERE         0.82
inch          0.80
penny         0.77
grain         0.76
godd          0.76
Activations Density: 0.046%