    Explanations

    breakdown followed by categorization

    Negative Logits
    Token               Value
    " implementar"      0.64
    "מה"                0.63
    " відповідно"       0.63
    " pourtant"         0.62
    " tasked"           0.61
    " implement"        0.60
    " плану"            0.60
    " solides"          0.59
    "shrink"            0.59
    " complex"          0.59
    Positive Logits
    Token               Value
    "categor"           1.26
    " category"         1.23
    " categorized"      1.23
    " categor"          1.16
    " Categor"          1.14
    " classified"       1.07
    " categorie"        1.07
    " categorization"   1.06
    " CATEG"            1.05
    "Categor"           1.04
    Activation Density: 0.303%

    No Known Activations