INDEX
Explanations
words and phrases related to architecture and titles associated with it
Negative Logits
orf       -0.18
zers      -0.17
quot      -0.15
itor      -0.15
osity     -0.14
oil       -0.14
arium     -0.14
arian     -0.14
izer      -0.14
040       -0.14
Positive Logits
ipel      0.29
itect     0.23
uate      0.20
etypes    0.19
etype     0.19
aic       0.19
angel     0.19
itecture  0.19
bishop    0.18
(es       0.17
Activations Density 0.020%