INDEX
    Explanations

    exclamations or interjections followed by conjunctions

    expressions of surprise or emphasis

    Negative Logits
    Token         Logit
    "Roaming"     -0.78
    "-+-+"        -0.78
    "20439"       -0.77
    "BILITIES"    -0.76
    "enta"        -0.74
    "IRE"         -0.74
    "Ö¼"          -0.73
    "ngth"        -0.73
    "İĭ"          -0.72
    "ERG"         -0.72
    Positive Logits
    Token         Logit
    " yeah"       1.15
    " yes"        1.00
    " wait"       0.99
    " yea"        0.94
    " sorry"      0.93
    " wow"        0.92
    " hello"      0.90
    " dear"       0.88
    " hey"        0.87
    " goodness"   0.83
    Activation Density: 0.034%

    No Known Activations