INDEX
    Explanations
    Negative Logits
    tuk             -0.06
     Betty          -0.06
     Names          -0.06
     humor          -0.06
     burden         -0.06
     seperate       -0.06
    ρκ              -0.06
                    -0.06
    riday           -0.06
     Story          -0.06
    Positive Logits
     од             0.07
     forState       0.07
    alesce          0.06
     نظام           0.06
     ابزار          0.06
    dropout         0.06
    _APPLICATION    0.06
    하시            0.06
    CVE             0.06
     İzmir          0.06
    Activation Density: 0.008%

    No Known Activations