    Explanations

    The neuron activates on the word “serious,” detecting instances of that severity-indicating adjective.

    Negative Logits
    Token          Logit
    " depleted"    -0.08
    " lact"        -0.07
    " canine"      -0.07
    " Ten"         -0.06
    " oat"         -0.06
    "bohydr"       -0.06
    "ANA"          -0.06
    "upport"       -0.06
    " Oct"         -0.06
    "Ten"          -0.06
    Positive Logits
    Token                       Logit
    " serious"                  0.15
    " seriously"                0.12
    " seriousness"              0.10
    " Serious"                  0.09
    " Seriously"                0.09
    (token missing in source)   0.09
    "ris"                       0.09
    "serious"                   0.08
    " grave"                    0.08
    " karar"                    0.08
    Activation Density: 0.008%

    No Known Activations