Explanations
The neuron is primarily responsive to personal pronouns used in the text.
Negative Logits
Token           Logit
contemporary    -0.07
clipse          -0.07
style           -0.06
pojištění       -0.06
석              -0.06
WISE            -0.06
fine            -0.06
المع            -0.06
กลาง            -0.06
trained         -0.06
Positive Logits
Token           Logit
nin             0.07
drafting        0.06
ملی             0.06
âu              0.06
roducing        0.06
&);↵↵           0.06
Priority        0.06
uede            0.06
سین             0.06
Bezier          0.06
Activations Density: 0.051%
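
The page does not say how the density figure is computed. One common reading is the fraction of token positions on which the neuron's activation is above zero, and the sketch below works under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the neuron fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 5 of which activate the neuron.
sample = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.9, 2.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.051% would mean the neuron fires on roughly 5 of every 10,000 tokens, so it is a sparse unit.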