INDEX
    Explanations

    prompts asking for opinions or thoughts

    questions asking for opinions or thoughts

    Negative Logits
    announced     -0.79
     Adin         -0.71
    clad          -0.70
    iere          -0.66
    licensed      -0.65
    itz           -0.62
    wealth        -0.62
    known         -0.61
    Fund          -0.60
    documented    -0.60
    Positive Logits
    estyles       0.72
    76561         0.70
     constitu     0.68
     about        0.67
    aptic         0.67
    IUM           0.66
    rison         0.65
     disapprove   0.64
     ABOUT        0.64
     aloud        0.64
    Activation Density: 0.029%

    No Known Activations