INDEX
    Explanations

    references to gratitude and appreciation

    Negative Logits
     moschino         -0.89
     Himo             -0.84
    ########.         -0.84
    Hochspringen      -0.82
     Seeder           -0.81
    الحياه            -0.79
     Surname          -0.78
     margiela         -0.77
     imagui           -0.77
    <unused71>        -0.77
    Positive Logits
    <eos>             0.84
    ↵↵                0.73
     The              0.72
                      0.66
                      0.60
     ...              0.60
    ↵↵↵               0.56
     …                0.55
     A                0.55
     Do               0.55
    Activation Density: 0.308%

    No Known Activations