INDEX
    Explanations

    references to specific individuals or their works

    Negative Logits
    Token     Logit
    azzi      -0.17
    indy      -0.16
    iT        -0.15
    asmus     -0.15
    ÙĪØ§Ø¡    -0.15
    itet      -0.14
    .ai       -0.14
    TT        -0.14
    amarin    -0.14
    enheim    -0.14
    Positive Logits
    Token     Logit
    wig       0.20
    ger       0.19
     Lad      0.17
    heck      0.16
    ève       0.15
    olph      0.15
    ouce      0.15
    -addon    0.14
    rido      0.14
     Down     0.14
    Activation Density: 0.013%

    No Known Activations