    Explanations

    expressions of hesitation or uncertainty

    Negative Logits
    Token          Logit
    "celik"        -0.18
    "isay"         -0.16
    "yonel"        -0.16
    "اط"           -0.15
    "ially"        -0.15
    "@nate"        -0.15
    "edBy"         -0.15
    "eway"         -0.15
    " [â̦]↵↵"      -0.14
    "alama"        -0.14
    Positive Logits
    Token          Logit
    " well"        0.29
    " um"          0.27
    " er"          0.27
    " uh"          0.24
    "well"         0.23
    " shall"       0.23
    " ah"          0.22
    "shall"        0.21
    " wait"        0.21
    " erm"         0.20
    Activation Density: 0.087%

    No Known Activations