INDEX
    Explanations

    commas followed by common words

    items following punctuation

    NEGATIVE LOGITS
    y     1.61
    l     1.59
    ע     1.52
    ی     1.39
    u     1.20
    s     1.17
    c     1.13
    b     1.05
    t     1.02
    ى     1.01
    POSITIVE LOGITS
     for           1.12
     Perché        0.92
    ка             0.91
     absurd        0.87
     expedit       0.87
    జేపీ            0.84
     grizz         0.84
     médicos       0.83
     cristianos    0.82
     abhor         0.82
    Activation Density: 0.054%

    No Known Activations