INDEX
    Explanations

    medical terms, thinking, and foreign words

    Negative Logits
    " razvoja"  0.53
    " seiner"  0.46
    " cercando"  0.46
    " towards"  0.46
    " yanında"  0.46
    " одним"  0.46
    ";."  0.45
    " offset"  0.45
    "т"  0.44
    " одному"  0.44

    Positive Logits
    " మాత్ర"  0.48
    "Interview"  0.43
    "Drop"  0.43
    " hank"  0.42
    " बाघ"  0.42
    "legg"  0.41
    " జ్ఞాప"  0.41
    "除去"  0.41
    " Drop"  0.40
    " টাইগার"  0.40

    Act Density 0.002%

    No Known Activations