INDEX
    Explanations
    NEGATIVE LOGITS
    Token             Logit
    "is"              0.50
    "am"              0.38
    " स्थानांतरित"       0.38
    "en"              0.38
    " Moving"         0.37
    "getMax"          0.37
    "م"               0.37
    " _."             0.37
    " KNOW"           0.36
    "kowe"            0.36
    POSITIVE LOGITS
    Token             Logit
    " human"          0.99
    " beings"         0.97
    " manusia"        0.97
    " humaine"        0.95
    "Human"           0.95
    "human"           0.91
    " челове"         0.90
    " humana"         0.90
    " 인간"            0.89
    " чове"           0.88
    Activation Density: 0.042%

    No Known Activations