INDEX
    Explanations

    instances of the pronoun "we" in various contexts

    NEGATIVE LOGITS
    rze         -0.16
    gni         -0.15
    ieri        -0.15
    omo         -0.15
    -valu       -0.15
    arshal      -0.15
    rosso       -0.14
    lector      -0.14
     mosaic     -0.14
    rippling    -0.14
    POSITIVE LOGITS
    acer        0.14
    666         0.14
    Ø·ÙĨ        0.14
    316         0.14
    456         0.14
    jev         0.13
    hek         0.13
    irting      0.13
    ãĤ¡         0.13
     èĹ         0.13
    Activation Density: 0.055%

    No Known Activations