INDEX
Explanations
This neuron activates on the exact two-word phrase “form of.”
Negative Logits
Beats                           -0.07
uppies                          -0.07
even                            -0.06
인정                            -0.06
_esc                            -0.06
fce                             -0.06
NavigationItemSelectedListener  -0.06
základě                         -0.06
Did                             -0.06
ortho                           -0.06
Positive Logits
mein        0.07
:'',        0.07
utf         0.06
expense     0.06
Smoke       0.06
$('         0.06
IFICATION   0.06
римін       0.06
testcase    0.06
$json       0.06
Activations Density 0.009%
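
The density figure can be read as the share of sampled tokens on which the neuron fires at all. The sketch below assumes that definition (activation strictly above a threshold of zero, divided by the total token count); the page does not state how the value was computed, and the function name, threshold, and sample data are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation. The source page does not define the term.
    """
    if len(activations) == 0:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: one active token out of four.
sample = [0.0, 0.0, 1.3, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.009% would correspond to roughly 9 active tokens per 100,000 sampled.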