INDEX
    Explanations

    references to books and their authors

    Negative Logits
    enville    -0.15
    éĦ         -0.15
    oter       -0.15
    iscrim     -0.14
    ayne       -0.14
     Gel       -0.14
    ys         -0.14
    chy        -0.14
    esel       -0.14
    jango      -0.13
    Positive Logits
    agrid       0.16
    ÏĨι         0.15
    heimer      0.15
    Ãłng        0.15
    ardım       0.14
    iset        0.14
     Piet       0.14
    æĮ¯ãĤĬ      0.14
     Feinstein  0.14
    uchos       0.14
    Act Density 0.106%

    No Known Activations