INDEX
    Explanations

    This neuron detects words that express evaluation of quality or performance, such as “adequacy” and “effectiveness.”

    Negative Logits
     bait         -0.07
    (chalk        -0.07
    096           -0.06
    494           -0.06
    Low           -0.06
    Accessible    -0.06
     corrupted    -0.06
     fossil       -0.06
     updateTime   -0.06
    mousedown     -0.06
    Positive Logits
    Việc          0.07
    ีช            0.07
     último       0.07
    ुलन           0.07
    .'));↵        0.06
    orang         0.06
     akan         0.06
     keras        0.06
     удал         0.06
     Nẵng         0.06
    Activation Density: 0.085%

    No Known Activations