INDEX
    Explanations

    organizations/groups

    Negative Logits
    -0.07  " IMDb"
    -0.07  " Contribution"
    -0.07  " bridge"
    -0.06  " Change"
    -0.06  "алася"
    -0.06  " seront"
    -0.06  " quote"
    -0.06  " Personnel"
    -0.06  "очь"
    -0.06  "stakes"

    Positive Logits
    0.07  " flexDirection"
    0.06  "	cs"
    0.06  " ESL"
    0.06  " concussion"
    0.06  " تصميم"
    0.06  "ˆ"
    0.06  " throm"
    0.06
    0.06  ".va"
    0.06  ".deg"

    Activation Density 0.010%

    No Known Activations