INDEX
Explanations
The neuron specializes in spotting the phrase “method of” (i.e. occurrences of “method” immediately followed by “of”).
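As a rough text-level illustration of the pattern this explanation describes, the sketch below finds places where "method" is immediately followed by "of". It is only a proxy: the neuron operates on model tokens, not words. The name `find_method_of`, the case-insensitive matching, and the whitespace handling are assumptions made for this example and are not part of the dashboard.

```python
import re

# Assumption: "immediately followed" means the word "method", then whitespace,
# then the word "of". Case-insensitivity is also an assumption of this sketch.
METHOD_OF = re.compile(r"\bmethod\s+of\b", re.IGNORECASE)


def find_method_of(text: str) -> list[tuple[int, int]]:
    """Return the (start, end) character span of every "method of" match."""
    return [m.span() for m in METHOD_OF.finditer(text)]


# "method of" matches; "method for" and "methods of" do not.
spans = find_method_of("The method of moments differs from a method for sampling.")
```

Comparing spans like these against the top-activating tokens is one way to sanity-check the explanation, bearing in mind that word boundaries and token boundaries do not always coincide.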
Negative Logits
Token        Logit
аков         -0.07
riends       -0.06
urate        -0.06
composers    -0.06
anship       -0.06
ウ           -0.06
ecess        -0.06
olf          -0.06
eno          -0.05
.iv          -0.05
Positive Logits
Token        Logit
SetTitle     0.07
'';↵         0.07
rsa          0.07
') ↵         0.07
تنظيف        0.06
Закону       0.06
navigate     0.06
exemptions   0.06
:↵           0.06
( ↵          0.06
Activations Density: 0.006%