INDEX
    Explanations

    phrases related to political and controversial statements

    Negative Logits
    creen        -0.82
     ABE         -0.71
     Tasman      -0.70
     Belg        -0.69
    wagen        -0.68
     shroud      -0.67
     Mirage      -0.67
    iewicz       -0.67
     destro      -0.66
     Reprodu     -0.64

    Positive Logits
    ª            1.33
    ł            1.27
    IJ           1.24
    ij           1.17
    ¹            1.10
    ı            1.08
    Ĵ            1.07
    «            1.02
    ¤            1.01
    ľ            1.01

    Activation Density: 0.785%

    No Known Activations