INDEX
    Explanations

    The neuron is specifically searching for words related to tear gas

    references to tears and emotional distress

    Negative Logits
    atar        -0.84
    orea        -0.72
    ancial      -0.71
    stood       -0.70
    eport       -0.68
    raviolet    -0.66
    enza        -0.66
    ammy        -0.66
    ocre        -0.65
    enhagen     -0.65
    Positive Logits
    bows        1.04
    bow         0.95
    ful         0.89
    stals       0.81
    fully       0.80
    bian        0.80
    stained     0.78
     tears      0.73
    iffs        0.73
    stal        0.72
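
For readers unfamiliar with these lists: token/logit tables like the two above are commonly produced by projecting a neuron's output direction through the model's unembedding matrix and keeping the extremes. The sketch below illustrates that general idea only. It is an assumption about how such a list could be computed, not a description of how this page's values were generated, and the names `top_logit_tokens`, `direction`, `unembed`, and `vocab` are hypothetical, with random toy data standing in for real model weights.

```python
import numpy as np

def top_logit_tokens(direction, unembed, vocab, k=10):
    """Return the k most negative and k most positive (token, logit) pairs.

    direction: (d_model,) output direction of the neuron (hypothetical input)
    unembed:   (d_model, vocab_size) unembedding matrix (hypothetical input)
    vocab:     list of vocab_size token strings
    """
    logits = direction @ unembed            # one logit contribution per token
    order = np.argsort(logits)              # indices sorted ascending by logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy data only: random weights, not taken from any real model.
rng = np.random.default_rng(0)
vocab = [f"tok{i}" for i in range(50)]
unembed = rng.normal(size=(8, 50))
direction = rng.normal(size=8)

negative, positive = top_logit_tokens(direction, unembed, vocab, k=3)
for token, value in negative + positive:
    print(f"{token:<12}{value:+.2f}")
```

Under this reading, a high positive value for a token such as " tears" would mean the neuron pushes the model toward predicting that token when it fires, and a negative value means it pushes away from it.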
    Activation Density: 0.016%

    No Known Activations