INDEX
    Explanations

    phrases related to political accusations or controversies

    Negative Logits
    allo            -0.17
    /Instruction    -0.15
    μα              -0.15
    pga             -0.15
    ofil            -0.14
    enden           -0.14
    lds             -0.14
    uset            -0.14
    .resolve        -0.13
    894             -0.13
    Positive Logits
     column         0.19
     columns        0.19
     Slate          0.18
     Salon          0.18
     essays         0.17
    column          0.17
     columnist      0.17
     Atlantic       0.17
     salon          0.16
     Establishment  0.16
    Activation Density: 0.181%

    No Known Activations