    Explanations

    questions that begin with "who."

    Negative Logits
    Token       Logit
    "mente"     -0.19
    "ted"       -0.17
    "illos"     -0.15
    "erais"     -0.15
    "tti"       -0.15
    "taire"     -0.15
    "ting"      -0.15
    "Carthy"    -0.15
    "ning"      -0.14
    "uran"      -0.14
    Positive Logits
    Token       Logit
    " else"     0.24
    "osh"       0.18
    "soever"    0.17
    "_else"     0.16
    "ops"       0.15
    "oping"     0.14
    "\telse"    0.14
    "ÑĢей"      0.14
    "inspace"   0.14
    " ELSE"     0.14
    Activation Density: 0.049%

    No Known Activations