    Negative Logits
    Token            Logit
    "ματος"          -0.07
    "Mixed"          -0.07
    " iki"           -0.06
    "られる"          -0.06
    "لمات"           -0.06
    "(drop"          -0.06
    "Hop"            -0.06
    " ciclo"         -0.06
    " تماس"          -0.06
    "Animations"     -0.06
    Positive Logits
    Token            Logit
    " قهر"           0.07
    " fitted"        0.06
    " governed"      0.06
    " Jeho"          0.06
    " constructor"   0.06
    " unstable"      0.06
    "ightly"         0.06
                     0.06
    " disadv"        0.06
    " affect"        0.06
    Activation Density: 0.070%

    No Known Activations