    Explanations

    references to the United Nations (UN)

    the presence of references to the United Nations

    Negative Logits
    Token              Logit
    " Feldman"         -0.72
    " conservatism"    -0.69
    " constituency"    -0.65
    " fascism"         -0.65
    " shave"           -0.62
    " goodbye"         -0.62
    " Slate"           -0.62
    " laz"             -0.62
    " everything"      -0.61
    " Sel"             -0.61
    Positive Logits
    Token              Logit
    "UN"               4.08
    "UNE"              1.91
    " UN"              1.78
    "un"               1.65
    "uns"              1.59
    "UNCH"             1.39
    "UL"               1.39
    "unt"              1.38
    "UM"               1.35
    "OU"               1.34
    Activation Density: 0.011%

    No Known Activations