    Explanations

    negative portrayals of individuals, particularly focusing on characteristics such as arrogance and hypocrisy

    Negative Logits
    iſt              -0.71
    \\               -0.68
    Hozzáférés       -0.66
                     -0.63
    Lister           -0.60
    XNUMX            -0.59
    ^(@)             -0.57
    utafitiHapana    -0.57
    ſind             -0.56
    \\               -0.54
    Positive Logits
    IIRC             0.65
    körül            0.63
    TestingModule    0.61
    culturelle       0.59
    aurait           0.59
    romero           0.57
    annuel           0.55
    ggf              0.55
    OFDb             0.55
    inderdaad        0.55
    Activation Density: 0.611%

    No Known Activations