    Explanations
    NEGATIVE LOGITS
    Token               Logit
    " assum"            -0.07
    ":auto"             -0.07
    (not rendered)      -0.06
    "PROFILE"           -0.06
    "Manage"            -0.06
    " طبق"              -0.06
    (not rendered)      -0.06
    " Neville"          -0.06
    " přip"             -0.06
    " wielding"         -0.06
    POSITIVE LOGITS
    Token               Logit
    " Diabetes"         0.07
    "emez"              0.07
    "\titem"            0.07
    " sat"              0.06
    " healthcare"       0.06
    " brother"          0.06
    " father"           0.06
    " nearest"          0.06
    "wan"               0.06
    " cleaned"          0.06
    Act Density 0.001%

    No Known Activations