INDEX
    Explanations

    phrases related to political statements or claims

    Negative Logits
    ernel
    -0.15
    linger
    -0.15
    vore
    -0.14
    κη
    -0.14
     Zeit
    -0.14
    ิà¸Ķ
    -0.14
     гÑĢÑĥн
    -0.14
     exile
    -0.14
    zeit
    -0.14
     Shirley
    -0.13
    Positive Logits
     Fake
    0.20
     Radical
    0.17
     Rig
    0.16
    Fake
    0.16
    476
    0.16
    è«ĩ
    0.15
    cher
    0.14
    ISTA
    0.14
     Rip
    0.14
    okia
    0.14
    Act Density 0.049%

    No Known Activations