INDEX
    Explanations

    political and news-related terms and phrases

    Negative Logits
    xual          -0.72
    ones          -0.67
    TEXTURE       -0.67
    76561         -0.65
    tions         -0.64
    :-            -0.64
    AAF           -0.62
     Anon         -0.61
    lihood        -0.60
    naires        -0.59
    Positive Logits
     smartest     0.66
     celeb        0.65
    anooga        0.62
    Inside        0.61
     digest       0.59
    inion         0.58
     Kavanaugh    0.57
     noon         0.57
     uncover      0.57
    akespeare     0.57
    Act Density 0.112%

    No Known Activations