INDEX
    Negative Logits
                    -0.07
    grams           -0.07
    CHANT           -0.07
    vette           -0.07
    Tem             -0.06
    GRAM            -0.06
    Emp             -0.06
     mor            -0.06
    	GUI         -0.06
    REMOTE          -0.06
    Positive Logits
     Angry          0.06
     большой        0.06
    roat            0.06
     mour           0.06
    ategorical      0.06
     داشت           0.06
     draws          0.06
    ,err            0.06
     ngờ            0.06
     elektronik     0.06
    Activation Density: 0.004%

    No Known Activations