    Explanations

    phrases indicating restrictions or limitations in communication

    Negative Logits
    Token         Logit
    "eyin"        -0.07
    "zano"        -0.07
    "argin"       -0.07
    " seins"      -0.07
    "つけ"        -0.07
    "zim"         -0.06
    "乳"          -0.06
    "ccion"       -0.06
    "くだ"        -0.06
    "つぶ"        -0.06
    Positive Logits
    Token         Logit
    " publicly"   0.07
    "ussen"       0.07
    " freely"     0.07
    "ucher"       0.07
    "nap"         0.06
    "ccount"      0.06
    " quot"       0.06
    " самост"     0.06
    " Bale"       0.06
    " Gus"        0.06
    Activation Density: 0.001%

    No Known Activations