INDEX
    Explanations

    words related to a specific designation or labeling system

    Negative Logits
    Token            Logit
    " étoient"       -0.61
    " Infór"         -0.59
    " Italij"        -0.57
    " recargable"    -0.56
    " plegable"      -0.56
    " enfans"        -0.55
    " hendes"        -0.55
    " født"          -0.54
    " виправивши"    -0.54
    ":✨"             -0.54
    Positive Logits
    Token            Logit
    " Af"            1.13
    "Af"             1.02
    " af"            0.95
    " AF"            0.94
    "af"             0.93
    "AF"             0.85
    " afl"           0.62
    " AFR"           0.61
    "afs"            0.59
    " Afr"           0.57
    Act Density 0.424%

    No Known Activations