INDEX
    Explanations

    demonstrative pronouns

    Negative Logits
    Token             Logit
    (not shown)       -0.06
    ' chromium'       -0.06
    ' poured'         -0.06
    ' racing'         -0.06
    ' Letters'        -0.06
    ' consort'        -0.06
    ' Persistence'    -0.06
    ' Cr'             -0.06
    ' descended'      -0.06
    ' freeway'        -0.06
    Positive Logits
    Token             Logit
    ',↵↵↵↵'           0.07
    (not shown)       0.07
    'δη'              0.06
    (not shown)       0.06
    'executor'        0.06
    '.mime'           0.06
    'ональ'           0.06
    '"};↵'            0.06
    'shi'             0.06
    'نت'              0.06
    Activation Density: 0.031%

    No Known Activations