    Explanations

    disruptions or negative effects

    This neuron responds to generic adverse-effect phrases, especially constructions like “any setback” or “any disruption” that indicate potential negative outcomes.

    Negative Logits
    Token            Logit
    "이에"            -0.06
    " Gale"          -0.06
    " complement"    -0.06
    " البته"         -0.06
    "_classifier"    -0.06
    "_CONTROLLER"    -0.06
    "_likelihood"    -0.06
    " embry"         -0.06
    " Kadın"         -0.06
    "(mode"          -0.06
    Positive Logits
    Token            Logit
    " přest"         0.07
    " off"           0.07
    "iliate"         0.06
    "-fr"            0.06
    " scooter"       0.06
    " dedicated"     0.06
    "่งข"             0.06
    "ा।↵↵"           0.06
    [token missing]  0.06
    ":value"         0.06
    Activation Density: 0.064%

    No Known Activations