    Explanations

    questions beginning with "why" or "how."

    Negative Logits
    Token       Logit
    NUMX        -0.58
    Поэтому     -0.49
     sécher     -0.44
    קישורים     -0.43
    {}".        -0.42
    oof         -0.42
    escrit      -0.41
    {}'.        -0.40
    elif        -0.40
    aarrggbb    -0.39
    Positive Logits
    Token       Logit
     How        0.92
     Why        0.89
    How         0.88
     What       0.84
    Why         0.84
    What        0.81
     Who        0.80
    Who         0.76
     ¿          0.73
    Which       0.71
    Activation density: 0.227%

    No Known Activations