INDEX
    Explanations

    Racism, violence, republicans

    Negative Logits
    Token            Logit
    "ed"             -0.98
    "Violence"       -0.90
    " Racism"        -0.88
    "e"              -0.84
    " Violence"      -0.84
    " Morality"      -0.82
    " violence"      -0.82
    " Fascism"       -0.82
    "violence"       -0.80
    "i"              -0.77
    Positive Logits
    Token            Logit
    "änä"            0.46
    "OIR"            0.45
    " Know"          0.44
    "PreferredItem"  0.44
    " geleden"       0.42
    "fixtures"       0.42
    "örté"           0.42
    " saites"        0.42
    " ago"           0.41
    "ács"            0.41
    Activation Density: 1.083%

    No Known Activations