INDEX
    Explanations

    expressions of curiosity or doubt

    followed by questions

    expressing wonder or curiosity

    Negative Logits
    "AntiForgeryToken"    -0.66
    "ifié"                -0.60
    "bigliamento"         -0.57
    " Astoria"            -0.57
    "bebasan"             -0.56
    " Vass"               -0.56
    "aronne"              -0.55
    "🔥🔥"                -0.54
    "oinette"             -0.53
    " отношению"          -0.53
    Positive Logits
    " wonder"       1.06
    " wondering"    1.02
    "wonder"        1.02
    " WONDER"       0.93
    " wondered"     0.93
    " Wonder"       0.91
    " doubt"        0.87
    "Wondering"     0.86
    "Wonder"        0.86
    " why"          0.78
    Act Density 0.067%

    No Known Activations