INDEX
    Explanations

    actions taken by individuals

    NEGATIVE LOGITS
    CHED        -0.14
    нÑıÑĤ       -0.14
    Scalars     -0.14
    Permanent   -0.14
    itus        -0.14
    omer        -0.14
     Pom        -0.13
     totiž      -0.13
    au          -0.13
    ulin        -0.13
    POSITIVE LOGITS
    strup       0.18
     pant       0.17
    isque       0.15
    ippers      0.14
     sorte      0.14
    apg         0.14
    urette      0.14
    uddy        0.14
    yor         0.13
     Impl       0.13
    Activation density: 0.254%

    No Known Activations