    Explanations

    This neuron activates on the exact two‐word phrase “form of.”

    Negative Logits
    Token                             Logit
    " Beats"                          -0.07
    "uppies"                          -0.07
    " even"                           -0.06
    " 인정"                             -0.06
    "_esc"                            -0.06
    "fce"                             -0.06
    "NavigationItemSelectedListener"  -0.06
    " základě"                        -0.06
    " Did"                            -0.06
    "ortho"                           -0.06
    Positive Logits
    Token                             Logit
    " mein"                           0.07
    ":'',"                            0.07
    " utf"                            0.06
    "expense"                         0.06
    " Smoke"                          0.06
    "$('"                             0.06
    "IFICATION"                       0.06
    "римін"                           0.06
    " testcase"                       0.06
    "$json"                           0.06
    Activation Density: 0.009%

    No Known Activations