INDEX
    Explanations

    questions that seek to understand reasons or motivations

    Negative Logits
    Token         Logit
    '$'           -0.72
    'intahan'     -0.70
    ' Cortes'     -0.70
    ' ")['        -0.67
    ' Manning'    -0.66
    ' metallo'    -0.64
    'Lass'        -0.64
    ' CGSize'     -0.64
    ' DEP'        -0.63
    'ovatel'      -0.63
    Positive Logits
    Token         Logit
    ' why'        1.75
    'why'         1.68
    ' Whyte'      1.61
    'Why'         1.58
    ' Why'        1.55
    ' WHY'        1.49
    'WHY'         1.48
    ' Waarom'     1.42
    ' Warum'      1.40
    ' varför'     1.34
    Activation Density: 0.059%

    No Known Activations