INDEX
    Explanations
    Negative Logits
     około  1.35
     حوالي  1.34
     около  1.31
     один  1.30
     några  1.30
     только  1.26
     одного  1.24
     одну  1.24
     gần  1.17
     aproximadamente  1.16
    Positive Logits
     defamation  1.01
     explanations  1.00
     explanation  0.97
    তাবাদী  0.95
    তাকে  0.94
     staircase  0.94
     commentators  0.93
     limbs  0.92
    limb  0.91
     spirits  0.91
    Activation Density: 0.021%

    No Known Activations