INDEX
    Explanations

    references to the word 'it' in various contexts

    Negative Logits
    Token            Logit
    " themselves"    -0.17
    "ection"         -0.17
    "oled"           -0.15
    "ington"         -0.15
    "aight"          -0.14
    "均"             -0.14
    " their"         -0.14
    "皆"             -0.14
    "alled"          -0.13
    " *"             -0.13
    Positive Logits
    Token            Logit
    " Its"           0.24
    " its"           0.23
    "Its"            0.23
    " оно"           0.19
    "aviest"         0.17
    " itself"        0.16
    " alone"         0.16
    "，它"           0.16
    "它"             0.16
    "-même"          0.15
    Activation Density: 0.125%

    No Known Activations