INDEX
    Explanations

    phrases that emphasize possession and personal connection

    Negative Logits
    "jan"       -0.15
    "orz"       -0.15
    " famed"    -0.14
    "gel"       -0.14
    "ties"      -0.14
    "awn"       -0.14
    "äm"        -0.14
    " parent"   -0.14
    "ALES"      -0.13
    "ebb"       -0.13
    Positive Logits
    "desired"   0.18
    " choice"   0.16
    " desired"  0.15
    "íĭĢ"       0.15
    "venes"     0.15
    " wil"      0.15
    "avin"      0.15
    "vest"      0.15
    "onne"      0.14
    "Ñıз"       0.14
    Act Density 0.169%
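
For context, an activation density figure like the one above is usually read as the share of tokens on which the feature fires at all. The sketch below is only an illustration under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical. This is not the dashboard's actual computation.

```python
from typing import Iterable


def activation_density(activations: Iterable[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the fraction of tokens with a non-zero
    (above-threshold) activation, expressed as a percentage.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation value")
    active = sum(1 for v in values if v > threshold)
    return 100.0 * active / len(values)


# Hypothetical data: 2 active tokens out of 1000 (not taken from this feature).
sample = [0.0] * 998 + [1.3, 0.7]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.169% would mean the feature is active on roughly 17 of every 10,000 tokens.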

    No Known Activations