INDEX
    Explanations

    mentions of historical figures and scholarly works

    Negative Logits
    Token            Logit
    "aison"          -0.15
    "ÙĬاÙĨ"           -0.15
    " Schmidt"       -0.14
    "ekil"           -0.14
    "ương"           -0.14
    "suit"           -0.13
    "ONO"            -0.13
    "ylene"          -0.13
    "roadcast"       -0.13
    "osate"          -0.13

    Positive Logits
    Token            Logit
    " et"             0.21
    " writing"        0.17
    " wrote"          0.16
    " SND"            0.16
    " argument"       0.15
    "(ed"             0.15
    " interviewed"    0.15
    " ed"             0.15
    " grounding"      0.14
    " write"          0.14
    Activation Density: 0.166%

    No Known Activations