    NEGATIVE LOGITS
    Token           Logit
    " From"         -0.92
    "↵↵"            -0.90
    "          "    -0.89
    "From"          -0.88
    "ليق"           -0.87
    "あらゆる"      -0.86
    "        "      -0.86
    " folgen"       -0.84
    " Pretty"       -0.84
    " rupture"      -0.84
    POSITIVE LOGITS
    Token           Logit
    "Spieler"       1.05
    "wachsenen"     1.04
    " innych"       0.99
    "beforeEach"    0.98
    "ilado"         0.96
    "ilares"        0.95
    "haften"        0.94
    " tibur"        0.90
    "Fach"          0.88
    "tuo"           0.88
    Activation Density: 0.140%

    No Known Activations