INDEX
    Explanations

    phrases indicating susceptibility or vulnerability to various issues

    Negative Logits
    uably             -0.91
    arta              -0.84
    notations         -0.81
    roy               -0.75
    Registered        -0.75
    ä                 -0.73
    miah              -0.72
    cade              -0.71
     pictured         -0.71
    Leary             -0.70
    Positive Logits
     criticism        0.98
     attack           0.97
     temptation       0.96
     ridicule         0.94
     attacks          0.94
     withstand        0.93
     fend             0.92
     resist           0.91
     manipulation     0.90
     extinction       0.90
    Activation Density: 0.049%

    No Known Activations