INDEX
    Explanations

    mentions of specific entities or organizations in the text

    Negative Logits
     ayaa         -0.78
     umo          -0.75
     saad         -0.69
     naer         -0.68
     karna        -0.67
     cabrio       -0.67
     kark         -0.66
     saha         -0.65
     hej          -0.65
     pank         -0.65
    Positive Logits
     itself       0.51
    Skocz         0.51
    '             0.50
    Enregistrer   0.48
    <bos>         0.48
                  0.48
     Wtf          0.46
    uksessa       0.43
    $'            0.43
    Xoxo          0.43
    Activation Density: 0.330%

    No Known Activations