INDEX
    Explanations

    The neuron activates on occurrences of the word “alarm”, including its plural and other forms, in the text.

    Negative Logits
    Token                   Logit
    "Po"                    -0.09
    " Po"                   -0.08
    " Pays"                 -0.07
    "ingt"                  -0.07
    " От"                   -0.07
    " Fut"                  -0.07
    ".PreparedStatement"    -0.07
    "ưu"                    -0.07
    " Sept"                 -0.07
    " Kot"                  -0.07

    Positive Logits
    Token                   Logit
    " alarm"                 0.13
    "alarm"                  0.10
    "Alarm"                  0.10
    "_ALARM"                 0.09
    " alarms"                0.09
    " Alarm"                 0.09
    " alarming"              0.08
    "horia"                  0.08
    " inflamm"               0.08
                             0.07
    Activation Density: 0.002%
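
    To give a feel for how sparse a 0.002% density is, the short sketch below converts the percentage into an expected count of activating tokens. It assumes the density is the fraction of sampled tokens on which the neuron fires; the page itself does not define the term. The function name `expected_activations` and the one-million-token sample size are hypothetical, chosen only for illustration.

```python
def expected_activations(density_percent: float, n_tokens: int) -> float:
    """Expected number of tokens on which the neuron fires.

    Assumption: density_percent is a per-token firing rate expressed
    as a percentage (0.002 means 0.002% of tokens).
    """
    return density_percent / 100.0 * n_tokens


if __name__ == "__main__":
    # Hypothetical sample of one million tokens.
    print(round(expected_activations(0.002, 1_000_000), 6))
```

    Under that assumption, the neuron would fire on roughly 20 tokens per million, or about one token in every 50,000.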

    No Known Activations