INDEX
    Explanations
    NEGATIVE LOGITS
    intenance         -0.81
    ostante           -0.80
    protoc            -0.80
    ING               -0.79
    (!__              -0.75
    imedia            -0.75
     незавершена      -0.73
     réfugi           -0.73
    BufferException   -0.72
    клопе             -0.71
    POSITIVE LOGITS
    y                  1.06
    e                  1.04
    i                  0.95
    a                  0.91
    an                 0.88
    o                  0.87
    s                  0.73
    u                  0.71
    k                  0.71
    t                  0.69
    Act Density 1.457%

    No Known Activations