    Explanations

    racism and related concepts

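    Negative Logits
    "均匀" (uniform): 0.56
    (no token shown): 0.55
    (no token shown): 0.55
    " Humidity": 0.54
    " ምግብ" (food): 0.52
    (no token shown): 0.52
    "各种" (various): 0.52
    " Geschwindigkeit" (speed): 0.50
    " utilisée" (used): 0.50
    "季节" (season): 0.50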
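    Positive Logits
    " полити" (politi-, Russian word stem): 0.62
    "民主" (democracy): 0.62
    "political": 0.61
    " демокра" (democra-, Russian word stem): 0.60
    " politische" (political): 0.60
    " political": 0.59
    " activism": 0.59
    ",": 0.58
    "政治" (politics): 0.57
    " apartheid": 0.57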
    Activation Density: 0.001%

    No Known Activations
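
    To put the density figure above in concrete terms, here is a minimal sketch. It assumes that density means the share of sampled token positions on which the feature's activation is above zero; the dashboard itself does not define the term, and the function name, threshold, and sample data are all hypothetical. Under that assumption, 0.001% corresponds to roughly one firing token in 100,000.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Return the percentage of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the fraction of sampled positions where the
    feature fires. This is an illustrative definition, not one confirmed by the source.
    """
    values = list(activations)
    if not values:
        return 0.0
    firing = sum(1 for a in values if a > threshold)
    return 100.0 * firing / len(values)


# Hypothetical sample: the feature fires on 1 of 100,000 token positions.
sample = [0.0] * 99_999 + [3.2]
print(f"{activation_density(sample):.3f}%")  # prints "0.001%"
```

    A feature this sparse would rarely appear in a modest sample of text, which fits the absence of recorded activations.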