    Explanations

    expressions of surprise or exclamation

    Negative Logits
     useRouter     -0.89
     myſelf        -0.86
     itſelf        -0.74
     themſelves    -0.70
    himself        -0.68
     himſelf       -0.66
    )」             -0.64
    "]}            -0.62
    herself        -0.61
     himself       -0.60

    Positive Logits
     Oh             1.89
    Oh              1.78
     oh             1.71
    oh              1.57
     OH             1.47
    OH              1.32
    Ohh             1.28
    Ohhh            1.21
     Ooh            1.18
    Ohhhh           1.14
    Activation Density: 0.044%

    No Known Activations