    Explanations

    references to the word "who"

    Negative Logits
    Token             Logit
    "ซ์"              -0.61
    " transporta"     -0.60
    "lans"            -0.59
    " Erle"           -0.58
    "しかない"        -0.57
    "a"               -0.56
    "nica"            -0.56
    "を与える"        -0.56
    "ן"               -0.56
    " **/\n"          -0.55
    Positive Logits
    Token             Logit
    " who"            2.10
    " Who"            2.10
    "Who"             2.02
    "who"             1.94
    " WHO"            1.80
    "WHO"             1.80
    " hvem"           1.59
    " whom"           1.58
    " quién"          1.56
    "quién"           1.55
    Act Density 0.039%

    No Known Activations