INDEX
    Explanations

    expressions of confusion or difficulty in understanding concepts

    NEGATIVE LOGITS
     постоянно
    -0.24
     always
    -0.21
     constantly
    -0.21
     siempre
    -0.20
     sempre
    -0.20
     всегда
    -0.19
     toujours
    -0.19
    always
    -0.19
     now
    -0.18
    一直
    -0.18
    POSITIVE LOGITS
     simply
    0.24
     downright
    0.22
     even
    0.21
     sometimes
    0.19
    Sometimes
    0.19
    甚至
    0.19
    even
    0.18
     Simply
    0.18
     depending
    0.18
     sogar
    0.18
    Act Density 0.236%

    No Known Activations