INDEX
    Explanations

    scientific names and people

    Negative Logits
    0.49  ỳnh
    0.38  សម្រាប់ការ
    0.37  streetlight
    0.37  ongyang
    0.37  ўным
    0.37
    0.36  junit
    0.36
    0.36  食べて
    0.36  ສົ່ງ
    Positive Logits
    0.46   flops
    0.44  ходит
    0.44   flop
    0.43   человека
    0.42   człowie
    0.41   Mensch
    0.41   человеку
    0.41   godz
    0.41   Behör
    0.39  Name
    Act Density 0.000%

    No Known Activations