INDEX
    NEGATIVE LOGITS
    "ly"       -0.64
    "fully"    -0.56
    "vin"      -0.56
    "tably"    -0.56
    "lah"      -0.55
    "δη"       -0.55
    " segre"   -0.53
    "oly"      -0.53
    "sing"     -0.52
    "/"        -0.52
    POSITIVE LOGITS
    " للمعارف"           1.00
    " resourceCulture"   0.96
    " виправивши"        0.86
    " esternos"          0.85
    " nahilalakip"       0.83
    " &___"              0.83
    "SBATCH"             0.82
    " Ones"              0.81
    " antaranya"         0.81
    "standers"           0.77
    Activation Density: 0.122%

    No Known Activations