INDEX
    Explanations

    negations and questions about preferences or intentions

    Negative Logits
    Token                Logit
    ".central"           -0.14
    "Gatt"               -0.14
    "GV"                 -0.14
    " кост"              -0.14
    ".scalablytyped"     -0.14
    "canf"               -0.14
    "atis"               -0.14
    "uke"                -0.14
    "ove"                -0.14
    "inati"              -0.14
    Positive Logits
    Token                Logit
    " wouldn"            0.23
    " want"              0.20
    " mind"              0.20
    " dream"             0.20
    " rather"            0.19
    "mind"               0.17
    " trade"             0.17
    "nt"                 0.17
    " Dream"             0.17
    "梦"                 0.17
    Act Density 0.069%
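
    To put the density figure in perspective, the sketch below converts it into rough counts. It assumes that "Act Density" is the percentage of sampled tokens on which the feature has a nonzero activation. The page does not define the term, so that reading and both helper names are illustrative assumptions rather than part of any tool's API.

```python
# Assumption: "Act Density" is the percentage of sampled tokens on which
# the feature has a nonzero activation. Both helpers are hypothetical.

def expected_activations(density_percent: float, num_tokens: int) -> float:
    """Expected number of activating tokens in a sample of num_tokens."""
    return density_percent / 100.0 * num_tokens


def tokens_per_activation(density_percent: float) -> float:
    """Average number of tokens between activations."""
    return 100.0 / density_percent


if __name__ == "__main__":
    density = 0.069  # value reported on this page, in percent
    print(expected_activations(density, 1_000_000))
    print(round(tokens_per_activation(density)))
```

    Under that assumption, the feature fires on roughly 690 tokens per million, or about once every 1,450 tokens.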

    No Known Activations