    Explanations

    negation and expressions of uncertainty or doubt

    Negative Logits
    Token           Logit
    "bas"           -0.15
    " отсутств"     -0.14
    "itti"          -0.14
    " ONLY"         -0.14
    " Missing"      -0.14
    " Cone"         -0.14
    " missing"      -0.14
    "jan"           -0.14
    " basically"    -0.13
    " saja"         -0.13
    Positive Logits
    Token           Logit
    " strictly"     0.21
    " caring"       0.19
    "strict"        0.19
    " TOUCH"        0.19
    " yet"          0.19
    "Touch"         0.18
    " guar"         0.18
    " touch"        0.18
    " Touch"        0.18
    " necessarily"  0.18
    Act Density 0.188%

    No Known Activations