Explanations
The neuron specifically fires on the causal connective “because.”
Negative Logits
však        -0.06
theorem     -0.06
็อก         -0.06
недели      -0.06
Maar        -0.06
.appspot    -0.06
_five       -0.06
Ingram      -0.05
clock       -0.05
dock        -0.05
Positive Logits
Rus         0.07
as          0.07
+");↵       0.07
effet       0.07
ru          0.07
influenza   0.07
「え         0.06
/apple      0.06
rim         0.06
figured     0.06
Activations Density 0.044%