INDEX
    Explanations

    reflexive pronouns in various contexts

    Negative Logits
    "icip"         -0.17
    "ož"           -0.15
    " cole"        -0.15
    "ecute"        -0.14
    "oes"          -0.14
    "antu"         -0.14
    "ừ"            -0.14
    "_topology"    -0.13
    "icipation"    -0.13
    " erle"        -0.13

    Positive Logits
    "uls"            0.22
    "conde"          0.20
    "ared"           0.19
    "ules"           0.19
    "ãĥ³ãĥĩãĤ£"    0.16
    "ign"            0.16
    " content"       0.16
    " Hutchinson"    0.16
    "vez"            0.15
    "enth"           0.15

    Activation Density: 0.004%

    No Known Activations