    Explanations

    say [opening brackets]

    Negative Logits
     concept        -0.07
    (blank)         -0.07
     condi          -0.07
     Ih             -0.07
    اطعة            -0.06
     expertise      -0.06
     commenced      -0.06
     Ideas          -0.06
     Clair          -0.06
    (blank)         -0.06
    Positive Logits
    (blank)         0.07
    Tit             0.06
    ças             0.06
     oppon          0.06
     yüzyıl         0.06
    ocl             0.06
    =en             0.06
    Refs            0.06
    Wenn            0.06
     còn            0.06
    Activation Density 0.050%

    No Known Activations