INDEX
    Explanations

    questions starting with "Which" followed by a verb

    the word "which" indicating questions or clarifications

    Negative Logits
    Token       Logit
    "Gy"        -0.68
    "Bas"       -0.68
    "Rog"       -0.67
    "gy"        -0.67
    "GROUND"    -0.65
    "bug"       -0.65
    "GY"        -0.64
    "mob"       -0.64
    "Bo"        -0.63
    "BLE"       -0.62
    Positive Logits
    Token            Logit
    "soever"         0.88
    " brings"        0.82
    " surprises"     0.76
    "xual"           0.75
    "ãĥ¯ãĥ³"         0.74
    " begs"          0.72
    "espie"          0.68
    " contrasts"     0.64
    "ño"             0.63
    "eele"           0.63
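
The token ãĥ¯ãĥ³ in the positive logits looks like a raw byte-level BPE vocabulary string rather than readable text. The page does not say which tokenizer the model uses, so the following is only a minimal sketch under the assumption that it is a GPT-2-style byte-level tokenizer. The helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any API named in this page.

```python
def bytes_to_unicode():
    """Rebuild the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumption: the model behind this page uses it)."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: vocabulary character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map each vocabulary character back to its byte, then decode as UTF-8.
    Only valid for strings in raw vocabulary form; a literal space is not
    in the table and would raise KeyError."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĥ¯ãĥ³"))  # prints ワン
```

Under that assumption the entry corresponds to the katakana string ワン. The other tokens in the lists already appear in decoded form (for example with literal leading spaces), so they should not be passed through this helper.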
    Activation Density: 0.085%

    No Known Activations