    Negative Logits
    Token           Logit
    "gru"           -0.81
    "ileaks"        -0.72
    "Statement"     -0.70
    " solicit"      -0.68
    "cohol"         -0.67
    " Guides"       -0.66
    "IDER"          -0.65
    "rea"           -0.64
    " dancers"      -0.64
    " discl"        -0.63
    Positive Logits
    Token           Logit
    " supremacy"    1.13
    " foothold"     1.07
    " victory"      1.04
    "tein"          0.99
    " throne"       0.97
    " coveted"      0.97
    " virginity"    0.92
    " possession"   0.92
    " custody"      0.91
    " victories"    0.91
    Activation Density: 2.673%

    No Known Activations