INDEX
    Explanations

    collective experiences and shared human conditions

    Negative Logits
    Token               Logit
    " diyor"            -0.51
    " történ"           -0.45
    " οποία"            -0.44
    " kuiten"           -0.43
    "上来"              -0.43
    " cofre"            -0.43
    " которое"          -0.42
    "X"                 -0.41
    " kekerasan"        -0.41
    " яке"              -0.41

    Positive Logits
    Token               Logit
    " humans"           0.94
    " beginnetje"       0.89
    "WebVitals"         0.82
    "humans"            0.80
    "Humans"            0.78
    " human"            0.77
    "ankind"            0.76
    " Humans"           0.76
    "DeleteBehavior"    0.73
    " humankind"        0.71
    Activation Density: 0.572%

    No Known Activations