INDEX
    Negative Logits
    Logit    Token
    -0.07    (blank in source)
    -0.07    " coefficient"
    -0.07    " politically"
    -0.07    "temporary"
    -0.07    (blank in source)
    -0.06    " hypertension"
    -0.06    " workspace"
    -0.06    " =============================================================================↵"
    -0.06    "########################################################"
    -0.06    " kênh"
    Positive Logits
    Logit    Token
    0.07     "gré"
    0.06     " şiddet"
    0.06     "sembled"
    0.06     " veloc"
    0.06     " terug"
    0.06     " vie"
    0.06     " zwarte"
    0.06     " laughed"
    0.06     "„J"
    0.06     " gorge"
    Activation Density: 0.067%

    No Known Activations