INDEX
    Explanations

    expressions of gratitude and acknowledgment

    Negative Logits
     …
    -0.19
     ..
    -0.17
     ….
    -0.16
    Ìģ
    -0.16
    -0.15
     ↵↵↵
    -0.15
     ↵↵↵↵
    -0.15
     ↵ ↵
    -0.15
     […]
    -0.15
     Fucking
    -0.15
    Positive Logits
    (ph
    0.27
     sort
    0.23
     kind
    0.23
     -
    0.22
    quote
    0.19
    sort
    0.19
     -,
    0.19
    kind
    0.18
     -↵
    0.17
     quote
    0.17
    Activation Density 0.013%

    No Known Activations