INDEX
Explanations
references to cynicism and related attitudes
Negative Logits
Interstitial  -0.73
Columbia      -0.66
pains         -0.65
rers          -0.65
%%            -0.64
%%            -0.64
DERR          -0.63
TAMADRA       -0.62
Gadget        -0.60
LESS          -0.59
Positive Logits
osure    1.30
ical     1.24
ophon    1.20
onymous  1.17
thia     1.13
aptic    1.09
ocent    1.08
opsis    1.07
ics      1.05
icians   1.04
Activations Density 0.007%