INDEX
    Explanations

    questions and inquiries related to understanding or learning more about a topic

    Negative Logits
    Token          Logit
    "3"            -0.19
    "4"            -0.18
    "2"            -0.18
    "5"            -0.17
    "6"            -0.17
    "1"            -0.17
    "8"            -0.17
    "11"           -0.16
    " ourselves"   -0.16
    "9"            -0.15
    Positive Logits
    Token          Logit
    " yo"          0.43
    " yu"          0.43
    " you"         0.40
    " u"           0.36
    " y"           0.36
    " tou"         0.36
    " ou"          0.34
    " ya"          0.32
    " yp"          0.32
    " Ñĥ"          0.30
    Act Density 0.229%

    No Known Activations