INDEX
    Explanations
    NEGATIVE LOGITS
    " the"      -0.71
    " a"        -0.70
    "s"         -0.65
    "a"         -0.55
    "sb"        -0.53
    "ity"       -0.53
    "ovsk"      -0.52
    " its"      -0.51
    "stood"     -0.51
    "isbury"    -0.51
    POSITIVE LOGITS
    "قایناق‌لار"      0.71
    " المعيارى"       0.71
    " myſelf"         0.68
    " beginnetje"     0.65
    " againſt"        0.65
    " pleaſure"       0.65
    "tvguidetime"     0.63
    " Houſe"          0.63
    " beſt"           0.63
    " ſever"          0.63
    Activation Density: 1.659%

    No Known Activations