INDEX
    Explanations

    proper nouns, particularly names and titles

    Negative Logits
    orama        -0.19
    opi          -0.18
    fram         -0.16
    ften         -0.15
     McKenzie    -0.15
     bc          -0.15
    emax         -0.15
    illum        -0.15
    apon         -0.15
    崎           -0.15
    Positive Logits
     Couch       0.16
    фици         0.15
    _simps       0.15
     Tent        0.14
    -UA          0.14
    ueva         0.14
    anky         0.14
    -fi          0.14
    hold         0.14
    ugi          0.14
    Act Density 0.102%

    No Known Activations