INDEX
    Explanations

    references to the word "who" and its variants in different contexts

    Negative Logits
        Token             Logit
        " experimenta"    -0.61
        " Brin"           -0.59
        " grosse"         -0.57
        " Optimum"        -0.57
        "ته"              -0.56
        "تمام"            -0.56
        " experiences"    -0.56
        "Brin"            -0.55
        "ματα"            -0.53
        "Eileen"          -0.53
    Positive Logits
        Token             Logit
        " Who"            1.69
        "Who"             1.61
        " who"            1.60
        "who"             1.44
        " WHO"            1.37
        "WHO"             1.35
        " whom"           1.35
        " hvem"           1.31
        "Quem"            1.31
        "Кто"             1.30
    Act Density 0.058%

    No Known Activations