INDEX
    Explanations

    words related to names and specific terms, possibly related to political figures or events

    tokens related to names or identifiers, particularly focused on a specific character or entity throughout the text

    Negative Logits
    ahime           -0.72
    kamp            -0.61
    nw              -0.60
     holiest        -0.56
     Franch         -0.56
     tresp          -0.54
    jong            -0.54
     developing     -0.51
    ung             -0.51
    eton            -0.51
    Positive Logits
    dden            0.68
    avis            0.67
    ody             0.66
    Marginal        0.64
    Pick            0.62
    atche           0.62
    icz             0.62
    ĵĺ              0.61
    iversal         0.60
    anyl            0.60
    Activation Density: 0.532%

    No Known Activations