INDEX
    Explanations

    terms and phrases indicating nobility or aristocracy

    Negative Logits
    icked    -0.18
    acen     -0.17
    agh      -0.15
     Fritz   -0.15
    eldon    -0.15
    egis     -0.15
    ocab     -0.15
    cı       -0.15
    apper    -0.15
    igma     -0.14

    Positive Logits
    ility    0.27
    les      0.27
    lemen    0.26
    odies    0.25
    LES      0.21
    bler     0.20
    ilities  0.20
    iliary   0.19
    bery     0.18
    ILITY    0.18
    Activation Density: 0.007%

    No Known Activations