    Explanations

    conditional phrases indicating uncertainty or inquiry

    Negative Logits
    Token         Logit
    "不了"         -0.20
    " Neither"    -0.20
    " neither"    -0.19
    " keine"      -0.17
    " nowhere"    -0.16
    " keinen"     -0.16
    "Neither"     -0.16
    "anst"        -0.16
    "=no"         -0.16
    " NEVER"      -0.16
    Positive Logits
    Token         Logit
    "/how"        0.51
    " there"      0.31
    " indeed"     0.30
    " anyone"     0.26
    " anybody"    0.24
    " any"        0.23
    " maybe"      0.23
    "/if"         0.22
    " anything"   0.21
    "there"       0.21
    Activation Density: 0.066%

    No Known Activations