INDEX
    Explanations
    Negative Logits
    Token             Logit
    " Kitt"           -0.08
    " trabaj"         -0.07
    " Km"             -0.07
    " manner"         -0.07
    " اج"             -0.06
    " veya"           -0.06
    " ir"             -0.06
    (whitespace)      -0.06
    "-enh"            -0.06
    " cuốn"           -0.06

    Positive Logits
    Token             Logit
    "Whole"           0.06
    " başta"          0.06
    " everybody"      0.06
    " everyone"       0.06
    " atlas"          0.06
    "OUNTER"          0.06
    "ớt"              0.06
    ".Submit"         0.06
    "tensorflow"      0.06
    " []),↵"          0.06

    Activation Density: 0.018%

    No Known Activations