INDEX
    Explanations

    idiomatic punctuation and non-Latin scripts

    Negative Logits
     numerous  0.65
     observable  0.65
     eigenvectors  0.63
     bequest  0.63
    (no token shown)  0.62
     demolished  0.62
    (no token shown)  0.62
    𝗥  0.61
     dynamically  0.60
    (no token shown)  0.60
    Positive Logits
    (no token shown)  0.90
    ر  0.89
    ل  0.84
    м  0.80
    ம்  0.79
    ной  0.79
    ת  0.79
    ند  0.76
    けた  0.72
    ین  0.71
    Act Density 0.258%

    No Known Activations