INDEX
    Explanations

    references to the pronoun "it."

    Negative Logits
    "_utilities"    -0.15
    "mont"          -0.14
    "adas"          -0.14
    " olsun"        -0.14
    "곡"            -0.14
    "578"           -0.14
    " sebou"        -0.14
    "ause"          -0.14
    "tolua"         -0.14
    "дина"          -0.14
    Positive Logits
    "iner"          0.42
    "chy"           0.32
    "ching"         0.26
    "unes"          0.26
    "alo"           0.23
    "ches"          0.23
    "alic"          0.23
    "inerary"       0.23
    "aly"           0.22
    " raining"      0.21
    Activation Density: 0.540%

    No Known Activations