INDEX
    Explanations

    articles/pronouns

    Negative Logits
    Token          Logit
    "txt"          -0.07
    "Sense"        -0.07
    " restau"      -0.07
    " authority"   -0.07
    "ジオ"         -0.06
    " вед"         -0.06
    " FDA"         -0.06
    " ноября"      -0.06
    " vip"         -0.06
    "들이"         -0.06
    Positive Logits
    Token          Logit
    "Gem"          0.07
    "Mage"         0.06
    "507"          0.06
    "::$"          0.06
    "={}"          0.06
    " Recently"    0.06
    "Recently"     0.06
    " prelim"      0.06
    "landırma"     0.06
    "/Base"        0.06
    Activation Density: 0.044%

    No Known Activations