INDEX
    Explanations

    non-English characters that are part of some kind of pattern or sequence

    special characters or symbols in the text

    Negative Logits
     Flavoring    -0.98
    nings         -0.97
    awaru         -0.94
     contrace     -0.85
    merce         -0.83
     mathemat     -0.75
    kef           -0.73
    thodox        -0.73
    holders       -0.71
    ifice         -0.71
    Positive Logits
    ãĤ¡                       1.05
    ople                      0.90
    ر                         0.82
    urn                       0.80
    ι                         0.79
    âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ  0.75
    а                         0.75
    ¹                         0.75
    ern                       0.75
    inx                       0.75
    Act Density 0.009%

    No Known Activations