    Explanations

    conditional phrases that pose questions about various scenarios or choices

    Negative Logits
     niet
    -0.20
     both
    -0.20
     neither
    -0.20
     nicht
    -0.20
     not
    -0.19
     tidak
    -0.18
     не
    -0.17
     ikke
    -0.17
     değil
    -0.17
     nejen
    -0.17
    Positive Logits
    whether
    0.20
     whether
    0.18
    Whether
    0.18
     ultimately
    0.17
     WHETHER
    0.17
     Whether
    0.17
    是否
    0.16
     zda
    0.16
     simple
    0.15
     simply
    0.15
    Act Density 0.068%

    No Known Activations