INDEX
    Explanations
    Negative Logits
    "outlined"        0.74
    " utilización"    0.71
    "Power"           0.71
    "おそらく"        0.67
    " زیاد"           0.66
    "Muchas"          0.65
    "Deployment"      0.64
    "许多"            0.64
    "Use"             0.64
    ">≥</"            0.63
    Positive Logits
    " back"       1.44
    " Back"       1.27
    " BACK"       1.05
    " tornare"    1.04
    "Back"        1.04
    " terug"      0.92
    " backs"      0.90
    "back"        0.90
    "に戻"        0.89
    " tilbake"    0.87
    Act Density 0.000%

    No Known Activations