INDEX
Explanations
references to design principles and challenges
references to design in various contexts
Negative Logits
nikov       -0.83
Lauder      -0.80
Witnesses   -0.78
selves      -0.76
essional    -0.75
Arist       -0.74
Ago         -0.73
ieri        -0.73
ICAN        -0.73
Sheen       -0.72
Positive Logits
ators       1.03
ator        0.96
yout        0.96
design      0.90
ations      0.87
ating       0.85
designs     0.85
ates        0.82
flaw        0.80
designer    0.78
Activations Density 0.033%