    Explanations

    phrases questioning or exploring reasons for certain occurrences

    expressions of curiosity or inquiries into reasons

    Negative Logits
    "aughed"         -0.85
    "iece"           -0.82
    "lator"          -0.80
    " pione"         -0.80
    "đ"              -0.77
    "vertisement"    -0.76
    "rawdownload"    -0.76
    "Ă"              -0.76
    "û"              -0.76
    "ø"              -0.76
    Positive Logits
    "soever"         0.98
    " they"          0.94
    " exactly"       0.84
    " people"        0.82
    " we"            0.80
    " nobody"        0.79
    " someone"       0.79
    " somebody"      0.76
    " there"         0.75
    " anyone"        0.73
    Activation Density: 0.049%

    No Known Activations