    Explanations

    data analysis

    Negative Logits
    " filmed"      -0.07
    "Images"       -0.06
    " וגם"         -0.06
    " ment"        -0.06
    " asym"        -0.06
    " cooking"     -0.06
    "/react"       -0.06
    "owners"       -0.06
    ":["           -0.06
    (not shown)    -0.06
    Positive Logits
    (not shown)    0.07
    " qi"          0.07
    "되었다"        0.07
    " بعيد"        0.07
    "-serif"       0.07
    (not shown)    0.07
    " vem"         0.06
    " اللعبة"      0.06
    "pth"          0.06
    (not shown)    0.06
    Act Density 0.039%

    No Known Activations