INDEX
Explanations
mentions or references to the word "design"
mentions of the term "design" and its various contexts
Negative Logits
Lauder     -0.81
ega        -0.77
ICAN       -0.77
Cele       -0.77
rican      -0.67
ieri       -0.67
nikov      -0.66
ngth       -0.64
selves     -0.63
Sheen      -0.63
Positive Logits
ating      1.10
ated       1.10
ators      1.09
ations     1.07
ator       1.06
aesthetic  0.96
ates       0.93
flaws      0.92
yout       0.92
flaw       0.90
Activations Density 0.066%