Explanations
punctuation
This neuron responds to mentions of “friend” or “friends” in the text.
Negative Logits
[layer        -0.07
PP            -0.07
*             -0.06
Sandy         -0.06
Diagram       -0.06
Tabs          -0.06
réalis        -0.06
alignments    -0.06
Capitol       -0.06
Ax            -0.06
Positive Logits
.Connect      0.06
/report       0.06
ZW            0.06
_RECEIVED     0.06
#import       0.06
общ           0.06
grub          0.06
carga         0.06
무료          0.06
tzv           0.06
Activations Density: 0.017%