    Explanations

    references to ownership and possession

    Negative Logits
    "uard"    -0.17
    "839"     -0.16
    " lib"    -0.15
    "acias"   -0.15
    "rl"      -0.15
    "รà¸ĵ"     -0.14
    " Lib"    -0.14
    "ck"      -0.14
    "ĥĿ"      -0.14
    "ude"     -0.14
    Positive Logits
    "æºĢ"     0.15
    "òi"      0.15
    "werk"    0.15
    "нÑıв"    0.15
    " conv"   0.14
    "uppe"    0.14
    "esub"    0.14
    " unt"    0.14
    "ADDE"    0.14
    "thouse"  0.14
    Activation Density: 0.023%

    No Known Activations