INDEX
    Explanations

    affirmations or expressions of agreement

    Negative Logits
    tracted     -0.16
    anca        -0.15
    acie        -0.15
    ãģ°ãģĭãĤĬ   -0.15
    à¥ĩद        -0.14
    umbo        -0.14
    cı          -0.14
     nowhere    -0.14
    tn          -0.14
    inciple     -0.14
    Positive Logits
     indeed     0.40
     inde       0.33
    enia        0.30
    /no         0.30
     sir        0.29
     Virginia   0.29
     Indeed     0.29
    inde        0.28
    Indeed      0.27
     sire       0.24
    Activation Density: 0.050%

    No Known Activations