Explanations
words related to design and architecture
references to design concepts and practices
Negative Logits
Token        Logit
ieri         -0.80
Lauder       -0.78
nikov        -0.78
ICAN         -0.78
Sheen        -0.74
essional     -0.74
Witnesses    -0.74
Ago          -0.74
Arist        -0.73
selves       -0.72
Positive Logits
Token        Logit
ators        1.04
yout         0.98
ator         0.97
ations       0.89
design       0.86
ates         0.86
ating        0.86
flaw         0.85
ated         0.84
designs      0.80
Activations Density: 0.035%