INDEX
    Negative Logits
    Token          Logit
    " говор"       -0.07
    " cuis"        -0.07
    " forgot"      -0.06
    ".liferay"     -0.06
    " revolves"    -0.06
    " aspiring"    -0.06
    "gable"        -0.06
    "ενοδο"        -0.06
    " doubtful"    -0.06
    "��"           -0.06
    Positive Logits
    Token          Logit
    "ektor"        0.07
    " Examiner"    0.07
    "veç"          0.06
    " args"        0.06
    " oxygen"      0.06
    "upp"          0.06
    "Storm"        0.06
    " OCC"         0.06
    " Harden"      0.06
    "\tproperties" 0.06
    Activation Density: 0.015%

    No Known Activations