INDEX
    NEGATIVE LOGITS
    Token            Logit
    " Велик"         -0.07
    " ministry"      -0.07
    "-тех"           -0.06
    " Sand"          -0.06
    " KUR"           -0.06
    "training"       -0.06
    "cov"            -0.06
    "ashed"          -0.06
    (whitespace)     -0.06
    " середови"      -0.06
    POSITIVE LOGITS
    Token            Logit
    " đến"           0.07
    "ONA"            0.07
    "thr"            0.07
    " advertisers"   0.06
    "datos"          0.06
    " [...]↵↵"       0.06
    "Pers"           0.06
    "ton"            0.06
    "SPA"            0.06
    " pa"            0.06
    Activation Density: 0.006%

    No Known Activations