    Negative Logits
    永久                    -0.09
     diets                -0.08
     holy                 -0.08
    未来                    -0.07
    时代                    -0.07
    -disciplinary         -0.07
     për                  -0.07
     Shed                 -0.07
     associated           -0.07
     responsabilidades    -0.07
    Positive Logits
     сустав               0.11
     вращ                 0.11
     груз                 0.10
     позвоноч             0.09
     қыз                  0.08
     Chicken              0.08
     гим                  0.08
     постепенно           0.08
     Triangle             0.08
     мах                  0.08
    Activation Density: 0.004%

    No Known Activations