Explanations
references and citations
The neuron activates on author surname tokens in bibliographic citations.
Negative Logits
Aure          -0.06
giveaway      -0.06
αυτά          -0.06
Spike         -0.06
แ             -0.06
Africa        -0.06
Bernard       -0.06
Kernel        -0.06
cameras       -0.06
foreground    -0.06
Positive Logits
14            0.07
icipant       0.06
着            0.06
.cn           0.06
وزيع          0.06
nationalists  0.06
dad           0.06
095           0.06
лись          0.06
특별          0.06
Activations Density 0.024%
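
The density figure above can be read as the share of sampled tokens on which the neuron fires. The sketch below shows one way such a number might be computed. It assumes "density" means the fraction of tokens whose activation is strictly above a threshold of zero. The function name, the threshold, and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 24 have a nonzero activation.
sample = [0.0] * 99_976 + [1.5] * 24
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

A density this low means the neuron is silent on almost all tokens, which fits an explanation tied to a narrow context such as author surnames in citations.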