INDEX
    Explanations

    foreign characters from specific languages, such as Serbian and Italian

    exclamation marks and special characters

    Negative Logits
    ĸļ            -0.78
    selage        -0.77
    ortmund       -0.77
    unciation     -0.77
    sonian        -0.76
    atis          -0.75
    abase         -0.74
     Gutenberg    -0.74
    orno          -0.73
    hof           -0.73
    Positive Logits
    dating        1.04
    ï¸ı           0.92
    coming        0.83
    ban           0.82
    ward          0.79
    LOAD          0.78
    stairs        0.77
    dates         0.75
    lishes        0.74
    mit           0.72
    Act Density 0.007%

    No Known Activations