INDEX
    Explanations
    NEGATIVE LOGITS
    " gob"        -0.07
    "......."     -0.06
    "ampoo"       -0.06
    " rid"        -0.06
    " eclipse"    -0.06
    " conv"       -0.06
    "ored"        -0.06
    " dumb"       -0.06
    " الاح"       -0.06
    " jou"        -0.06
    POSITIVE LOGITS
    "gene"            0.22
    "genes"           0.09
    "gen"             0.07
    "ene"             0.07
    " mają"           0.07
    " formulaire"     0.07
    ".preferences"    0.07
    "gne"             0.06
    [missing]         0.06
    "profession"      0.06
    Activation Density: 0.001%

    No Known Activations