INDEX
    Explanations

    verbs indicating action or intention

    phrases that indicate necessity, capability, or ongoing actions

    Negative Logits
    "bury"          -0.64
    " KL"           -0.56
    "hack"          -0.54
    " Sly"          -0.52
    " Britain"      -0.52
    " paternity"    -0.51
    " Pigs"         -0.50
    " Truth"        -0.50
    " Seah"         -0.49
    "hin"           -0.49
    Positive Logits
    "ディ"            0.65
    "ⓘ"             0.65
    "ツ"             0.64
    "pmwiki"        0.64
    "Tokens"        0.63
    " nevertheless" 0.61
    " nonetheless"  0.59
    " partName"     0.59
    " sugg"         0.58
    " downright"    0.58
    Activation Density: 0.986%

    No Known Activations