    Negative Logits
    Token             Logit
    "	dfs"           -0.07
    " jp"             -0.07
    (not rendered)    -0.07
    "atat"            -0.07
    " sung"           -0.07
    " inoc"           -0.07
    (not rendered)    -0.07
    "התחלה"           -0.07
    "ϭ"               -0.07
    " essa"           -0.06
    Positive Logits
    Token             Logit
    "_IMAGE"          0.07
    "_tid"            0.07
    (not rendered)    0.07
    " López"          0.07
    "ー�"             0.07
    "_IP"             0.07
    (not rendered)    0.07
    ".GroupBox"       0.07
    "erokee"          0.07
    " surprised"      0.06
    Act Density 0.026%

    No Known Activations