INDEX
    Explanations

    terms related to opinions, judgments, and qualifications

    phrases or references to groups of people collectively

    Negative Logits
    Token             Logit
    "isal"            -0.74
    "cation"          -0.68
    "mania"           -0.66
    "llah"            -0.66
    "pex"             -0.66
    "dden"            -0.65
    "etheless"        -0.65
    "aml"             -0.64
    "mire"            -0.64
    "ertodd"          -0.63
    Positive Logits
    Token             Logit
    " themselves"     1.25
    " selves"         1.08
    "selves"          1.04
    " helmets"        0.79
    "MpServer"        0.78
    " individually"   0.77
    " uniforms"       0.77
    " mouths"         0.76
    " necks"          0.72
    " jointly"        0.71
    Activation Density: 0.858%

    No Known Activations