INDEX
    Explanations

    phrases related to actions accompanied by consequences or reactions

    instances of reported actions or statements from the subject about threats and requests

    Negative Logits
    "llah"        -0.67
    "cation"      -0.65
    "isal"        -0.64
    "etheless"    -0.63
    "mania"       -0.62
    "aml"         -0.62
    "panic"       -0.62
    "illo"        -0.62
    "culus"       -0.61
    "lance"       -0.58
    Positive Logits
    " themselves"     1.07
    " selves"         0.89
    "selves"          0.77
    "MpServer"        0.68
    " helmets"        0.67
    " mouths"         0.66
    " uniforms"       0.64
    " microphones"    0.60
    " orbits"         0.60
    " jointly"        0.60
    Activation Density 0.834%

    No Known Activations