INDEX
    Explanations

    This neuron activates strongly on phrases that describe a category with a superlative, such as “most common”.

    Negative Logits
    Token       Logit
     Papers     -0.07
     painting   -0.07
    160         -0.07
    ables       -0.06
    /Runtime    -0.06
                -0.06
     village    -0.06
     Removal    -0.06
     Sessions   -0.06
    ledger      -0.06
    Positive Logits
    Token       Logit
    Estimated   0.06
    ("{}        0.06
    ported      0.06
     },{↵       0.06
     буд        0.06
     зн         0.06
     ('$        0.06
    .ERR        0.05
    ंपर         0.05
    .ST         0.05
    Act Density 0.058%
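
    As a rough illustration of what an activation density figure like the one above measures, the Python sketch below computes the fraction of tokens on which a feature fires. It assumes that density means the share of tokens whose activation exceeds zero; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature
    is active at all (activation > 0 by default).
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 tokens, of which 58 have a nonzero activation.
sample = [0.0] * 99_942 + [1.5] * 58
density = activation_density(sample)
formatted = f"{density:.3%}"
print(formatted)
```

    With this made-up sample, 58 active tokens out of 100,000 give a density of 0.00058, which formats as 0.058%.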

    No Known Activations