INDEX
    Explanations

    references to negative reactions or criticism

    Negative Logits
    316             -0.15
    opic            -0.15
    ilda            -0.15
    å¯Ħ             -0.15
    chip            -0.14
    uldu            -0.14
    iare            -0.14
    agus            -0.14
    æ³ķ人           -0.14
    150             -0.14
    Positive Logits
    IPA             0.15
    idden           0.14
    رÙĥ            0.14
    rek             0.14
    edly            0.14
    draft           0.14
    eds             0.14
    ãĥĪãĥ«          0.13
     paddingRight   0.13
    loon            0.13
    Activation Density: 0.002%

    No Known Activations