    Negative Logits
    Token                 Logit
    " Gill"               -0.09
    (token not shown)     -0.09
    " compensate"         -0.09
    "pog"                 -0.08
    " summarizes"         -0.08
    "761"                 -0.08
    " حسب"                -0.07
    " Pog"                -0.07
    (token not shown)     -0.07
    " confess"            -0.07
    Positive Logits
    Token                 Logit
    " Wes"                0.09
    " Van"                0.09
    " Wend"               0.08
    "LAND"                0.08
    " cac"                0.08
    " cas"                0.07
    " Cli"                0.07
    " Willow"             0.07
    " те"                 0.07
    " dispon"             0.07
    Act Density 0.019%

    No Known Activations