INDEX
    Explanations

    wrong answers or options

    Negative Logits
    Token       Value
    "are"       0.60
    "in"        0.56
    "or"        0.55
    " in"       0.53
    "be"        0.53
    "is"        0.51
    "al"        0.50
    " are"      0.50
    "schlag"    0.50
    " and"      0.50
    Positive Logits
    Token              Value
    (token missing)    0.54
    (token missing)    0.54
    " batalha"         0.52
    "तक"               0.52
    (token missing)    0.51
    "UG"               0.50
    "ünkü"             0.50
    "າມາດ"             0.50
    "अप्र"             0.50
    "εται"             0.50
    Act Density 0.000%

    No Known Activations