INDEX
    Explanations

    interrogative phrases and questions directed at individuals about their experiences and preferences

    Negative Logits
    Token         Logit
    "elter"       -0.16
    "uzu"         -0.15
    "annes"       -0.15
    "amm"         -0.15
    "yle"         -0.15
    "ůst"         -0.14
    " Farmer"     -0.14
    "razier"      -0.14
    " wonders"    -0.14
    "usk"         -0.14
    Positive Logits
    Token         Logit
    " advice"     0.24
    " Advice"     0.20
    "Advice"      0.17
    " message"    0.15
    " advise"     0.15
    " sunk"       0.15
    "favorite"    0.15
    "使"          0.15
    " favorite"   0.15
    "vala"        0.14
    Act Density: 0.044%

    No Known Activations