    Explanations

    special characters and punctuation

    Negative Logits
    Token     Logit
    " of"     1.45
    " at"     1.44
    " that"   1.28
    " be"     1.27
    " was"    1.23
    " it"     1.17
    " {"      1.14
    "いた"    1.06
    " for"    1.02
    " to"     0.94
    Positive Logits
    Token     Logit
    (blank)   1.43
    "其他"    0.90
    "."       0.90
    "iv"      0.82
    "ו"       0.78
    "ip"      0.77
    "-"       0.77
    "j"       0.77
    "in"      0.76
    "u"       0.75
    Act Density 1.698%

    No Known Activations