INDEX
    Explanations
    Negative Logits
    candidates  -0.06
    ادی  -0.06
    ccc  -0.06
    ibold  -0.06
     supporters  -0.06
     Word  -0.06
    ijo  -0.06
     Thomson  -0.06
     absolute  -0.06
    Start  -0.06
    Positive Logits
     mane  0.07
     Again  0.07
     ضربه  0.07
     رنگ  0.06
     плав  0.06
    atego  0.06
    istinguish  0.06
     Fon  0.06
    ρθρο  0.06
    _wr  0.06
    Activation Density: 0.122%

    No Known Activations