INDEX
    Explanations

    expressions of surprise or emphasis

    Negative Logits
    Token            Logit
    " myſelf"        -1.10
    " itſelf"        -0.98
    " Efq"           -0.95
    " useRouter"     -0.94
    " houſe"         -0.89
    " themſelves"    -0.87
    " himſelf"       -0.85
    " Houſe"         -0.85
    " ſeveral"       -0.78
    " againſt"       -0.78
    Positive Logits
    Token            Logit
    " Oh"            1.19
    "Oh"             1.11
    " oh"            1.00
    "oh"             0.99
    " OH"            0.92
    "OH"             0.83
    "Ohh"            0.81
    " sweet"         0.78
    "็จ"              0.76
    "Ohhhh"          0.76
    Activation Density: 0.061%

    No Known Activations