    Explanations

    affirmations of correctness and agreement in statements

    Negative Logits
    Token     Logit
    "ilst"    -0.17
    "oder"    -0.16
    "otto"    -0.14
    "åde"     -0.14
    "lse"     -0.14
    "reau"    -0.14
    "ute"     -0.14
    "rop"     -0.14
    "away"    -0.14
    "121"     -0.13
    Positive Logits
    Token         Logit
    " about"      0.27
    " tentang"    0.17
    "obuf"        0.16
    "/rfc"        0.16
    " correct"    0.15
    "emand"       0.15
    ".syntax"     0.15
    "关于"        0.15
    "ermo"        0.15
    "About"       0.14
    Activation Density: 0.041%

    No Known Activations