    Negative Logits
    Token           Logit
    " elector"      -0.08
    "Verb"          -0.08
    "Conc"          -0.07
    " delaying"     -0.07
    ".conc"         -0.07
    " wahr"         -0.07
    ".Ver"          -0.07
    ".Private"      -0.07
    " والمن"        -0.07
    "persistent"    -0.07
    Positive Logits
    Token           Logit
    " আন"           0.08
    " dx"           0.08
    " সূত্র"         0.07
    " нес"          0.07
    "istoire"       0.07
    " প্রত"          0.07
    " %{"           0.07
    " Jacob"        0.07
    "inct"          0.07
    "手续"          0.07
    Activation Density: 0.005%

    No Known Activations