INDEX
    Explanations

    political terms and affiliations

    Negative Logits
    Token           Logit
    "oval"          -0.60
    "achine"        -0.57
    " eco"          -0.55
    " Pacific"      -0.55
    " retro"        -0.54
    " reluct"       -0.54
    " revenge"      -0.53
    "FI"            -0.53
    " vengeance"    -0.53
    " unbeliev"     -0.51
    Positive Logits
    Token           Logit
    "()."           0.83
    " attRot"       0.79
    " counterparts" 0.78
    " anymore"      0.77
    "*."            0.76
    "!."            0.73
    "":["           0.73
    "."             0.73
    ".'"            0.71
    " existed"      0.71
    Activation Density: 0.268%

    No Known Activations