    Explanations

    responses related to correctness and validity in a quiz-like context

    NEGATIVE LOGITS
    []]
    -0.50
    VersionUID
    -0.49
    ()]
    
    -0.45
    ]');
    -0.45
    رشف
    -0.44
    ]]]
    -0.43
    '};
    -0.43
    /');
    -0.41
    Vege
    -0.41
    kant
    -0.41
    POSITIVE LOGITS
     guesses
    0.90
     المعيارى
    0.87
     guessing
    0.85
     guess
    0.85
     guessed
    0.85
     تضيفلها
    0.82
    guess
    0.80
     Guess
    0.78
    UnsafeEnabled
    0.75
    Guess
    0.74
    Act Density 0.390%

    No Known Activations