INDEX
    Explanations

    instances of the word "it" across various contexts

    Negative Logits
    " Either"     -0.15
    " either"     -0.15
    "anca"        -0.15
    " Sokol"      -0.14
    "either"      -0.14
    "wards"       -0.13
    "決"          -0.13
    "cÃŃ"         -0.13
    "Either"      -0.13
    "ories"       -0.13
    Positive Logits
    " pert"       0.23
    " done"       0.20
    "pert"        0.19
    " always"     0.19
    "always"      0.17
    " Pert"       0.16
    " relates"    0.16
    "-done"       0.15
    " siempre"    0.15
    " happens"    0.15
    Act Density 0.117%

    No Known Activations