INDEX
    Negative Logits
    Token           Logit
    " ''↵↵"         -0.07
    "female"        -0.06
    " uno"          -0.06
    " vou"          -0.06
    " come"         -0.06
    "==$"           -0.06
    " nigeria"      -0.06
    "Compose"       -0.06
    "्रम"           -0.06
    (not shown)     -0.06
    Positive Logits
    Token           Logit
    " further"      0.08
    " Participant"  0.07
    " Screening"    0.06
    " Executive"    0.06
    (not shown)     0.06
    "_color"        0.06
    " Leafs"        0.06
    " disag"        0.06
    "pais"          0.06
    "States"        0.06
    Act Density: 0.005%

    No Known Activations