    Explanations

    abbreviations and acronyms

    Negative Logits
    Token    Value
     I       1.55
    ς        1.11
    teras    1.00
    н        0.98
    liche    0.86
    și       0.86
    )</      0.85
    aktif    0.82
    </th>    0.80
    ACT      0.79
    Positive Logits
    Token                Value
    ي                    1.66
    (token not shown)    1.61
    на                   1.53
    x                    1.51
    ش                    1.50
    ان                   1.48
    (token not shown)    1.48
    י                    1.47
    ן                    1.47
    the                  1.46
    Activation Density: 0.000%

    No Known Activations