INDEX
    Explanations
    Negative Logits
    "เฟ"        -0.08
    (blank)     -0.07
    " реп"      -0.07
    "388"       -0.06
    " AS"       -0.06
    " luxe"     -0.06
    " gc"       -0.06
    " ric"      -0.06
    " wła"      -0.06
    "Backup"    -0.06
    Positive Logits
    " 영향"      0.06
    (blank)     0.06
    "Object"    0.06
    " görmek"   0.06
    "fbe"       0.06
    " physics"  0.06
    (blank)     0.06
    "수가"       0.06
    "Ann"       0.06
    "heets"     0.06
    Activation Density: 0.503%

    No Known Activations