    NEGATIVE LOGITS
    Token                 Logit
    " But"                -0.08
    " Or"                 -0.07
    " heer"               -0.07
    " indivíduo"          -0.07
    "หลัก"                -0.07
    "יי�"                 -0.07
    (token not shown)     -0.07
    "COMMENT"             -0.07
    " diagnosis"          -0.07
    (token not shown)     -0.06
    POSITIVE LOGITS
    Token                 Logit
    " recept"             0.10
    "ноп"                 0.09
    " detaine"            0.09
    " grate"              0.09
    " руки"               0.08
    " handing"            0.08
    " grips"              0.08
    " Grip"               0.08
    " grabs"              0.08
    " Receiver"           0.08
    Activation Density: 0.022%

    No Known Activations