INDEX
Explanations
The neuron detects production and proofreading credits in a document’s header metadata (lines like “produced by… Proofreading Team at… The Internet Archive”).
Negative Logits
Το        -0.06
pouch     -0.06
ردد       -0.06
قام       -0.06
başv      -0.06
amer      -0.06
!!,       -0.06
Beat      -0.06
_TYPED    -0.06
řik       -0.06
Positive Logits
elle      0.07
spacious  0.07
rbrace    0.07
eline     0.07
loạt      0.06
grad      0.06
facilit   0.06
Less      0.06
.weights  0.06
올         0.06
Activations Density 0.002%
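Activation density is usually read as the share of token positions on which the neuron is active. The page does not define it, so the sketch below assumes "active" means an activation above zero and reports the share as a percentage. The function name `activation_density` and the sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of positions whose activation exceeds threshold.

    Assumption: a position counts as active when its value is above
    `threshold` (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for value in activations if value > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 2 active positions out of 100,000 tokens.
sample = [0.0] * 100_000
sample[10] = 1.3
sample[5_000] = 0.4

print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.002% corresponds to roughly 2 active positions per 100,000 tokens, which fits a neuron tied to a rare header pattern.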