INDEX
Explanations
technical writing
The neuron activates on words that denote foundational concepts—specifically “basic” and “principles.”
Negative Logits
.ci            -0.07
_One           -0.07
Propagation    -0.07
uyệ            -0.06
pare           -0.06
alers          -0.06
้ด             -0.06
overseas       -0.06
унок           -0.06
sizlik         -0.06
Positive Logits
."             0.07
.”             0.07
)."            0.06
”↵             0.06
","            0.06
,',            0.06
)/(            0.06
.”↵↵           0.06
"),↵           0.06
"}             0.06
Activations Density 0.000%