INDEX
    Explanations

    personal pronouns

    NEGATIVE LOGITS
    Token          Logit
    "િને"          -0.08
    ",为"          -0.08
    "М"            -0.08
    " budu"        -0.08
    "GI"           -0.08
    (not shown)    -0.08
    "SETS"         -0.08
    " pole"        -0.07
    "里面"         -0.07
    " dient"       -0.07
    POSITIVE LOGITS
    Token            Logit
    " вообще"        0.09
    " acaso"         0.08
    " কখন"           0.08
    " ?↵↵"           0.08
    " locality"      0.08
    " Ee"            0.07
    " PAP"           0.07
    " ever"          0.07
    ".destination"   0.07
    "really"         0.07
    Activation Density: 0.100%

    No Known Activations