INDEX
    Explanations

    instances of the word "and" and similar conjunctions

    Negative Logits
    Token       Logit
    " diss"     -0.07
    "elder"     -0.07
    ".o"        -0.06
    "unter"     -0.06
    "exe"       -0.06
    "kur"       -0.06
    "amat"      -0.06
    "ensus"     -0.06
    "oyer"      -0.06
    "iglia"     -0.05
    Positive Logits
    Token         Logit
    "although"    0.07
    "éijij"       0.07
    "çŀ"          0.07
    " although"   0.07
    "there"       0.07
    " there"      0.06
    "icorn"       0.06
    "ithub"       0.06
    "this"        0.06
    "worthy"      0.06
    Activation Density: 0.093%
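To put the density figure in concrete terms, here is a minimal sketch. It assumes that activation density means the fraction of token positions on which the feature's activation is above zero; the page does not state its definition. The function name `activation_density` and the sample data are hypothetical, chosen only to reproduce a value of 0.093%.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 100,000 token positions, 93 of which fire.
sample = [1.5] * 93 + [0.0] * 99_907
density = activation_density(sample)
print(f"{density:.3%}")
```

Under that assumption, a density of 0.093% corresponds to the feature firing on roughly 93 of every 100,000 tokens.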

    No Known Activations