Explanations
the word "paradox" and related variations
references to paradoxes
Negative Logits
Token         Logit
Interstitial  -0.85
gars          -0.75
gur           -0.67
astered       -0.67
RAW           -0.66
aina          -0.66
OTT           -0.64
VEL           -0.63
rations       -0.63
Ķ             -0.63
Positive Logits
Token    Logit
paradox  1.23
Paradox  1.15
ioned    1.05
ical     1.04
ically   0.98
ively    0.84
es       0.80
ional    0.78
puzz     0.75
ibly     0.75
Activations Density: 0.008%
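
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes density means the fraction of tokens on which the feature's activation is above zero; the page does not state its exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    The dashboard's own definition may differ.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 8 tokens out of 100,000.
acts = [0.0] * 99_992 + [1.5] * 8
density = activation_density(acts)
print(f"{density:.3%}")  # 0.008%
```

Under that assumed definition, a density of 0.008% corresponds to the feature firing on roughly 8 of every 100,000 tokens, which fits a feature tied to a single uncommon word.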