INDEX
    Explanations

    negations or negative qualifiers in the text

    Negative Logits
    adera             -0.16
    ittal             -0.16
    æ¡Ī               -0.16
    èŃ                -0.15
    ãĥ¼ãĥ¬            -0.14
    angelo            -0.14
    اÙĦÙĩ              -0.14
    ÃŃst              -0.13
    ise               -0.13
    ummer             -0.13

    Positive Logits
    .scalablytyped    0.16
    èo                0.15
    olls              0.15
    icari             0.15
    çļ                0.14
     chaud            0.14
    azen              0.14
    Verdana           0.13
    æİ                0.13
    vail              0.13
    Act Density 0.005%

    No Known Activations