INDEX
    Explanations

    proper nouns, particularly names of authors and works

    NEGATIVE LOGITS
    cury
    -0.16
     warp
    -0.15
     played
    -0.15
    ustil
    -0.15
     podp
    -0.14
    partment
    -0.13
    051
    -0.13
    occo
    -0.13
    amina
    -0.13
    igest
    -0.13
    POSITIVE LOGITS
    Writes
    0.15
     Author
    0.14
    writes
    0.14
    olkien
    0.14
    åº
    0.14
    ルト
    0.14
     onBind
    0.13
    派
    0.13
     Zus
    0.13
    ویی
    0.13
    Activation Density: 0.288%

    No Known Activations