INDEX
    Explanations

    sentences that question or analyze concepts and their implications

    Negative Logits
    "lage"      -0.17
    "deaux"     -0.15
    " stride"   -0.14
    "aga"       -0.13
    " Sort"     -0.13
    " Äijâu"    -0.13
    "vek"       -0.13
    "let"       -0.13
    "Łèĥ½"      -0.13
    "ufe"       -0.13
    Positive Logits
    " ph"       0.51
    " put"      0.41
    " Put"      0.37
    " stated"   0.33
    " Ph"       0.32
    " PUT"      0.32
    " puts"     0.32
    ".put"      0.32
    "Put"       0.31
    "put"       0.31
    Activation Density: 0.173%

    No Known Activations