INDEX
    Explanations

    references to specific individuals or notable figures

    Negative Logits
    amac      -0.15
    θο        -0.14
     Bureau   -0.14
    oker      -0.14
    acular    -0.13
    à«        -0.13
    ncpy      -0.13
     Ej       -0.13
    owie      -0.13
     Taj      -0.13
    Positive Logits
    esch      0.33
    eme       0.31
    leich     0.31
    ew        0.31
    lied      0.28
    eden      0.28
    eb        0.27
    egen      0.26
    ottes     0.26
    es        0.25
    Activation Density: 0.011%

    No Known Activations