    Explanations

    perpetrator

    Negative Logits
     provided         -0.08
     تقييم            -0.08
     اندازه           -0.08
     مهمة             -0.08
     beoordeling      -0.08
     좋아             -0.08
     معیار            -0.08
     observational    -0.07
    timeouts          -0.07
                      -0.07
    Positive Logits
     conspir          0.10
     perpetrators     0.09
     guilty           0.09
     disguised        0.09
     concealed        0.09
     perpetr          0.09
     винов            0.09
     culprit          0.08
     clandest         0.08
     cunning          0.08
    Activation Density 0.015%

    No Known Activations