    NEGATIVE LOGITS
    " застав"       -0.07
    "Endian"        -0.07
    " tied"         -0.07
    "380"           -0.07
    " Scor"         -0.06
    " ô"            -0.06
    " throat"       -0.06
    " forced"       -0.06
    " backwards"    -0.06
    " frat"         -0.06
    POSITIVE LOGITS
    " published"    0.18
    " publish"      0.14
    " Published"    0.13
    "Published"     0.12
    " publishing"   0.12
    " publisher"    0.11
    " publication"  0.11
    " publishers"   0.10
    "published"     0.10
    " Publishing"   0.10
    Activation Density: 0.030%

    No Known Activations