INDEX
    Explanations

    pronouns and their usage in various contexts

    Negative Logits
    "oui"       -0.16
    " Ngh"      -0.14
    "ring"      -0.14
    " note"     -0.14
    " Holl"     -0.14
    "imb"       -0.13
    " gener"    -0.13
    "atti"      -0.13
    "åº"        -0.13
    "ilian"     -0.13
    Positive Logits
    "umbn"      0.17
    "ertino"    0.16
    "eldorf"    0.15
    "prite"     0.15
    "UnderTest" 0.15
    "ưỡng"      0.15
    "uzzi"      0.14
    "มà¸Ļ"      0.14
    "byn"       0.14
    "ircon"     0.14
    Act Density 0.437%

    No Known Activations