INDEX
    Explanations

    phrases that express certainty and surprise regarding various contexts

    Negative Logits
    Token         Logit
    "zzle"        -0.16
    "νÏİ"         -0.15
    "isses"       -0.15
    "оÑĢÑĤÑĥ"    -0.15
    "undle"       -0.15
    "åį"          -0.15
    "izik"        -0.14
    "addock"      -0.14
    "owitz"       -0.14
    "ÑģÑĤи"      -0.14

    Positive Logits
    Token         Logit
    " wonder"     0.56
    " Wonder"     0.40
    " wondered"   0.33
    " surprise"   0.31
    "Wonder"      0.31
    " wonders"    0.30
    " wondering"  0.30
    " unsur"      0.30
    "sur"         0.26
    "onder"       0.26
    Act Density 0.099%

    No Known Activations