INDEX
    Explanations

    derogatory or insulting terms used to describe people

    Negative Logits
    Token               Logit
    " spea"             -0.47
    "])->"              -0.45
    "väl"               -0.45
    "wię"               -0.45
    "英語版"            -0.44
    "}=\{"              -0.43
    "Produzione"        -0.42
    "SequentialGroup"   -0.42
    "BeginInit"         -0.42
    "δί"                -0.42
    Positive Logits
    Token               Logit
    " bastard"          1.04
    " idiot"            0.94
    " bastards"         0.94
    " scum"             0.93
    " Bastard"          0.92
    " idiots"           0.90
    " morons"           0.88
    " moron"            0.87
    " asshole"          0.86
    "umbag"             0.86
    Activation Density: 0.225%

    No Known Activations