INDEX
    Negative Logits
    Token          Logit
    "Unit"         -0.08
    "ReturnType"   -0.07
    " homes"       -0.07
    "ustom"        -0.07
    "aptop"        -0.07
    "homes"        -0.07
    "\tvertex"     -0.07
    " عملية"       -0.07
    "くの"         -0.07
    "ňují"         -0.07
    Positive Logits
    Token            Logit
    " say"           0.08
    " saying"        0.08
    "handleSubmit"   0.06
    " statement"     0.06
    " Rece"          0.06
    ".say"           0.06
    " yetiştir"      0.06
    "ossed"          0.06
    "ategic"         0.06
    " trä"           0.06
    Activation Density: 0.015%

    No Known Activations