INDEX
    Negative Logits
    " imageData"      -0.07
    "pun"             -0.06
    "usb"             -0.06
    " starred"        -0.06
    " sentimental"    -0.06
    " hemorrh"        -0.06
    " recru"          -0.06
    " файл"           -0.06
    " jclass"         -0.06
    " Dra"            -0.06
    Positive Logits
    "disp"            0.06
    " function"       0.06
    "对于"            0.06
    "CORE"            0.06
    "тий"             0.06
                      0.06
    "Emer"            0.06
    " مهم"            0.06
    ",在"             0.06
    " moderately"     0.06
    Activation Density: 0.061%

    No Known Activations