    NEGATIVE LOGITS
    Token                 Logit
    " _)"                 -0.08
    "Exist"               -0.08
    "Osc"                 -0.07
    " wyposaż"            -0.07
    ".Values"             -0.07
    " lze"                -0.07
    " existem"            -0.07
    "Aut"                 -0.07
    " mentality"          -0.07
    (token not rendered)  -0.07
    POSITIVE LOGITS
    Token                 Logit
    " concise"            0.10
    " পরিষ"               0.10
    " crisp"              0.09
    " clarity"            0.09
    (token not rendered)  0.08
    (token not rendered)  0.08
    " readability"        0.08
    " clara"              0.08
    " exposition"         0.08
    " confusion"          0.08
    Activation Density: 0.018%

    No Known Activations