    Negative Logits
    Token             Logit
    " rendering"      -0.07
    " rainfall"       -0.07
    "reement"         -0.07
    " Rendering"      -0.06
    " Actor"          -0.06
    " permite"        -0.06
    " God"            -0.06
    " Monument"       -0.06
    " controversy"    -0.06
    "Pressure"        -0.06
    Positive Logits
    Token             Logit
    " have"           0.11
    " has"            0.09
    " had"            0.08
    " Has"            0.07
    "ρή"              0.07
    [token missing]   0.07
    "/control"        0.06
    " kidnapped"      0.06
    " خام"            0.06
    "'ve"             0.06
    Activation Density: 0.034%

    No Known Activations