INDEX
    Explanations

    relies on, blocking, equal

    NEGATIVE LOGITS
    Token            Value
    " вопросы"       0.45
    " нового"        0.43
    " Hỏi"           0.43
    " preguntas"     0.41
    "ابق"            0.41
    "мены"           0.40
    " Evaluación"    0.38
    " вопросов"      0.38
    " вопросам"      0.38
    " questions"     0.38
    POSITIVE LOGITS
    Token            Value
    "yes"            0.71
    "Yes"            0.68
    " yes"           0.65
    "YES"            0.64
    " Yes"           0.63
    " YES"           0.57
    "lowest"         0.48
    "stretched"      0.45
    "justify"        0.44
    "indeed"         0.44
    Activation Density: 0.000%

    No Known Activations