INDEX
    Explanations

    phrases related to instructing actions or emphasizing consequences

    Negative Logits
     hairc      -1.45
     matel      -1.38
     milf       -1.38
     !...       -1.37
     perfet     -1.36
     Cfr        -1.34
     milano     -1.34
     Juf        -1.32
     ?...       -1.32
     exé        -1.32
    Positive Logits
     realize    0.67
     enjoy      0.67
     look       0.66
     become     0.65
     introduce  0.64
     try        0.64
     make       0.64
     tell       0.63
     say        0.62
    إذا         0.62
    Activation Density: 0.239%

    No Known Activations