INDEX
    Explanations

    negative sentiments or phrases indicating disapproval

    Negative Logits
    "lobals"    -0.18
    " McB"      -0.18
    "arhus"     -0.17
    "shal"      -0.16
    " æ¼Ķ"      -0.14
    "anco"      -0.14
    "abr"       -0.14
    "ãĥĨãĥ«"    -0.14
    "éry"       -0.14
    "ãĥĩãĥ«"    -0.14
    Positive Logits
    "anger"     0.19
    "æľĹ"       0.15
    "perator"   0.15
    "CG"        0.15
    "ÙIJÙĥ"      0.15
    " Nob"      0.15
    "nave"      0.15
    "翼"        0.15
    "aset"      0.15
    "ANGER"     0.15
    Act Density 0.035%

    No Known Activations