INDEX
    Explanations

    phrases that express personal experiences or memories

    Negative Logits
    Token           Logit
    "roduced"       -0.15
    "linger"        -0.15
    "à¹Ģ"           -0.14
    "alian"         -0.14
    "ãĥªãĤ«"        -0.14
    " given"        -0.14
    " cabo"         -0.14
    "sent"          -0.13
    "ormal"         -0.13
    " breeze"       -0.13
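
Two of the negative-logit entries, `à¹Ģ` and `ãĥªãĤ«`, look like encoding debris but are more likely raw vocabulary strings from a byte-level BPE tokenizer. The sketch below assumes the model uses the GPT-2 style byte-to-character table (the page does not say which tokenizer is in use), and `decode_token` is a hypothetical helper name chosen for illustration. It rebuilds that table and maps a vocabulary string back to readable text.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable character table (assumed tokenizer)."""
    # Printable bytes map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at or above 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Invert the table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level vocabulary string back into UTF-8 text."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_token("à¹Ģ"))     # เ
print(decode_token("ãĥªãĤ«"))  # リカ
```

Under that assumption, the two entries are a Thai vowel sign and a two-character katakana fragment rather than corrupted text. Note that this helper expects raw vocabulary strings, where a leading space is written as `Ġ`; tokens already shown with a literal leading space need no decoding.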
    Positive Logits
    Token           Logit
    " saw"          0.35
    " experienced"  0.34
    " heard"        0.33
    " Saw"          0.30
    " witnessed"    0.29
    " observed"     0.29
    "heard"         0.28
    " Heard"        0.26
    " caught"       0.25
    "seen"          0.25
    Act Density 0.202%
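
The page does not define "Act Density". A common reading, assumed here, is the percentage of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; `activation_density` and the sample data are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Percentage of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all, not the average activation strength.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up example: the feature fires on 1 of 500 token positions.
sample = [0.0] * 499 + [1.3]
print(f"{activation_density(sample):.3f}%")  # 0.200%
```

On this reading, a density of 0.202% would mean the feature fires on roughly one token in five hundred, which is typical of a sparse feature.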

    No Known Activations