INDEX
    Explanations
    NEGATIVE LOGITS
    -0.07
    " Dies"  -0.07
    "orch"  -0.07
    " yaklaş"  -0.06
    " وز"  -0.06
    "Submission"  -0.06
    " بج"  -0.06
    " radios"  -0.06
    "цион"  -0.06
    " cực"  -0.06
    POSITIVE LOGITS
    " modern"  0.07
    " вкус"  0.06
    " ANT"  0.06
    " ´"  0.06
    " sought"  0.06
    " maior"  0.06
    " İmparator"  0.06
    " world"  0.06
    " Cald"  0.06
    "(result"  0.06
    Activation Density: 0.040%

    No Known Activations