    Explanations

    references to "it" as a subject or object

    Negative Logits
    "ÙĨدÙĩ"       -0.18
    "åĩºåĵģèĢħ"   -0.15
    "rq"         -0.15
    "rud"        -0.15
    "ylum"       -0.15
    "maries"     -0.15
    "edList"     -0.15
    "lename"     -0.14
    "↵↵"         -0.14
    "elpers"     -0.14
    Positive Logits
    "iner"       0.38
    "unes"       0.29
    "SELF"       0.25
    "self"       0.24
    "chy"        0.24
    "ches"       0.22
    "ty"         0.22
    "alien"      0.21
    " its"       0.21
    "aly"        0.20
    Activation Density: 0.139%

    No Known Activations