    Explanations

    phrases related to expressing ideas or opinions in a clear and direct manner

    Negative Logits
    Token               Logit
    "egal"              -0.89
    "lav"               -0.70
    "emis"              -0.68
    " heavily"          -0.65
    " particularly"     -0.64
    "ittal"             -0.63
    " extensively"      -0.63
    "cler"              -0.62
    "orno"              -0.61
    " defenders"        -0.61
    Positive Logits
    Token               Logit
    " stated"           0.79
    "IFIED"             0.73
    "ify"               0.72
    " minded"           0.70
    "ified"             0.68
    " guessed"          0.67
    " stating"          0.65
    "ername"            0.65
    "ifying"            0.64
    " clicking"         0.63
    Activation Density: 0.032%

    No Known Activations