INDEX
    Explanations

    introspection and questioning

    Negative Logits
    Token            Logit
    (no token captured)  0.35
    " they"          0.32
    " provides"      0.31
    "اں"             0.30
    " plated"        0.30
    " contains"      0.29
    " pathways"      0.29
    " the"           0.28
    " streamlined"   0.28
    " hubo"          0.28
    Positive Logits
    Token            Logit
    " nghĩ"          0.39
    " misog"         0.38
    "Ironically"     0.38
    "我现在"          0.37
    " ощущение"      0.37
    " философ"       0.35
    "质疑"            0.35
    " şunu"          0.35
    " bertanya"      0.34
    " लक्षात"         0.34
    Activation Density: 0.081%

    No Known Activations