    NEGATIVE LOGITS
     enumerable    0.89
     لازم    0.84
     пищи    0.82
     дыха    0.82
     большая    0.77
    லாத    0.77
     قابل    0.76
     रिक्वायरमेंट    0.76
    ющиеся    0.75
    তা    0.75
    POSITIVE LOGITS
    alike    1.52
     👀    1.19
    andfeel    1.19
     forward    1.15
     intently    1.11
     inward    1.04
     outwards    0.99
     inwards    0.99
     into    0.99
     دنبال    0.99
    Activation Density: 0.078%

    No Known Activations