INDEX
    Explanations
    Negative Logits
    itize           -0.08
     Orte           -0.08
     بالقرب         -0.08
    Chunks          -0.08
     Henry          -0.08
     courtesy       -0.08
     contestants    -0.08
     heats          -0.07
    ుల              -0.07
     enclosure      -0.07
    Positive Logits
    masters         0.09
     mastery        0.08
    classmethod     0.08
    alliative       0.08
     практика       0.08
     lifelong       0.08
     praxis         0.08
     empowered      0.08
     empowerment    0.08
     sentencia      0.08
    Activation Density: 0.013%

    No Known Activations