    Explanations

    asking yes/no questions

    NEGATIVE LOGITS
    Token                 Logit
    "z"                   3.17
    " testigo"            2.84
    " tenta"              2.65
    (token not shown)     2.65
    "x"                   2.56
    "ו"                   2.48
    "te"                  2.42
    " découvert"          2.41
    " able"               2.39
    " tornar"             2.39
    POSITIVE LOGITS
    Token                 Logit
    "к"                   3.53
    " bộ"                 2.77
    (token not shown)     2.63
    " Благодаря"          2.51
    " reputation"         2.50
    (token not shown)     2.46
    "ია"                  2.46
    (token not shown)     2.44
    "GUILayout"           2.40
    "hedron"              2.35
    Activation Density: 0.395%

    No Known Activations