    Explanations

    The neuron activates on comparative/superlative words and phrases signaling evaluation or “what works best.”

    Negative Logits
    Token             Logit
     sıcak            -0.06
    .Small            -0.06
     Ivory            -0.06
    хов               -0.06
     بای              -0.06
    .Absolute         -0.06
    ?>">↵             -0.06
    三三              -0.06
    erialize          -0.06
     slaughtered      -0.05
    Positive Logits
    Token             Logit
     disaster         0.08
    esModule          0.07
     monument         0.07
     monuments        0.07
     Homo             0.07
    FileSync          0.07
    ρευ               0.07
     ster             0.07
     компании         0.06
    чних              0.06
    Activation Density: 0.013%

    No Known Activations