INDEX
    Explanations
    Negative Logits
     kidn             -0.09
     fisc             -0.08
    -buy              -0.08
    -dismiss          -0.08
     gym              -0.07
     économiques      -0.07
     puppy            -0.07
     економ           -0.07
     econôm           -0.07
     buy              -0.07
    Positive Logits
    Matcher           0.08
     legality         0.08
     matcher          0.08
     Africans         0.08
     Walsh            0.07
     aho              0.07
     permis           0.07
    Modulo            0.07
     asynchronously   0.07
     forb             0.07
    Activation Density: 0.002%

    No Known Activations