    Explanations

    references to the word "what" in various contexts

    what followed by pronouns or verbs

    Negative Logits
    Token               Logit
    ' Verge'            -0.56
    '☆☆'                -0.55
    'AMC'               -0.54
    'mez'               -0.54
    ' مشين'             -0.52
    'str'               -0.52
    'liez'              -0.52
    'erl'               -0.50
    'P'                 -0.50
    'oucí'              -0.50
    Positive Logits
    Token               Logit
    ' WHAT'             1.04
    'WHAT'              0.98
    ' what'             0.98
    ' What'             0.95
    'what'              0.91
    'What'              0.88
    'ArgsConstructor'   0.79
    ' happened'         0.78
    ' happens'          0.77
    ' <=",'             0.74
    Activation Density: 0.127%

    No Known Activations