INDEX
    Explanations

    phrases indicating personal beliefs or opinions

    Negative Logits
     desn
    -0.17
    uki
    -0.15
    oví
    -0.15
    ुग
    -0.15
    aucoup
    -0.15
    umba
    -0.14
    яд
    -0.14
    azzo
    -0.14
     reverse
    -0.14
     Reverse
    -0.14
    Positive Logits
    otel
    0.18
     hast
    0.16
    eln
    0.16
    iams
    0.15
    방
    0.15
    -char
    0.14
     Ink
    0.14
    akov
    0.14
    dd
    0.14
    elo
    0.14
    Activation Density: 0.065%

    No Known Activations