INDEX
Explanations
architecture details
This neuron detects references to architectural styles and building-feature terminology.
Negative Logits
Front       -0.07
Macros      -0.07
tiết        -0.07
_sms        -0.06
parks       -0.06
countered   -0.06
keine       -0.06
elo         -0.06
».↵↵        -0.06
Nintendo    -0.06
Positive Logits
weit        0.07
futuro      0.07
aer         0.06
uster       0.06
Aeros       0.06
dolay       0.06
$("         0.06
\uC         0.06
côt         0.06
’app        0.06
Activations Density 0.011%
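
The page does not say how the density figure is computed. As a minimal sketch, assuming it is the fraction of sampled token positions on which the neuron's activation is above zero, it could be calculated as below. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and chosen only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    neuron fires at all. The source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 11 active positions out of 100,000 tokens.
sample = [0.0] * 99_989 + [1.5] * 11
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this assumed definition, a density of 0.011% would mean the neuron fires on roughly 1 in 9,000 token positions.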