INDEX
    Explanations

    expressions of uncertainty or questions about knowledge

    "I don't know" followed by an expression of uncertainty

    NEGATIVE LOGITS
    Token               Logit
    "complexContent"    -0.42
    "posedge"           -0.38
    " autorytatywna"    -0.35
    "ьаж"               -0.35
    "ScopeManager"      -0.34
    "iální"             -0.34
    " ensured"          -0.33
    " voul"             -0.33
    " tartalomajánló"   -0.32
    " volon"            -0.32
    POSITIVE LOGITS
    Token               Logit
    " dunno"            0.93
    "Dunno"             0.90
    "IDK"               0.82
    " idk"              0.81
    " Idk"              0.80
    "idk"               0.72
    "Idk"               0.70
    "不知道"            0.65
    "我不知道"          0.65
    " unknowns"         0.64
    Activation Density: 0.012%

    No Known Activations