    Explanations

    which or what questions

    Negative Logits
    " prostu"      0.50
    " ибо"         0.49
    " lihtsalt"    0.47
    "தையும்"        0.45
    " magari"      0.44
    " outset"      0.44
    " quizá"       0.43
    " appunto"     0.43
    " 일단"         0.42
    " yaptı"       0.42
    Positive Logits
    " Which"       1.70
    "Which"        1.68
    " کدام"        1.15
    " What"        1.12
    "What"         1.11
    " which"       1.09
    "which"        1.02
    " WHICH"       1.02
                   0.99
    "哪个"          0.98
    Activation Density 0.080%

    No Known Activations