INDEX
    Explanations

    phrases indicating uncertainty or indecision

    Negative Logits
    Token (quoted)        Logit
    "PhysRevLett"         -0.68
    "tonsoft"             -0.61
    "互联网档案馆"        -0.59
    "jewództ"             -0.59
    ",**"                 -0.57
    "migrationBuilder"    -0.56
    "الحياه"              -0.56
    "ſelves"              -0.55
    " föruts"             -0.55
    " ')["                -0.55
    Positive Logits
    Token (quoted)        Logit
    " yet"                1.12
    "yet"                 0.89
    " Yet"                0.79
    "Yet"                 0.76
    " hasn"               0.71
    " chưa"               0.71
    "还没"                0.69
    " YET"                0.69
    " todavía"            0.67
    " haven"              0.65
    Activation Density: 0.295%

    No Known Activations