INDEX
    Explanations

    punctuation and separators

    Negative Logits
    '     1.27
    "     1.01
    📠    0.98
    🕚    0.97
    🌔    0.97
    🚋    0.97
    _'    0.96
    📼    0.96
    ).    0.95
    🏺    0.93
    Positive Logits
    е            1.21
                 1.12
    or           1.11
    ाई           1.11
     Это         1.11
     baratos     1.07
     رفض         1.07
    ens          1.06
     vrste       1.06
     Cualquier   1.05
    Activation Density: 0.714%

    No Known Activations