    Explanations

    words related to advice or suggestions

    Negative Logits
    Token       Logit
    anos        -0.68
    uador       -0.64
    mare        -0.63
    mber        -0.63
    opol        -0.61
    opoly       -0.61
    kas         -0.59
    eries       -0.59
     Scorpion   -0.59
    bley        -0.59
    Positive Logits
    Token       Logit
    },"         0.68
    .",         0.64
     :)         0.63
    .:          0.62
    ));         0.62
    .","        0.62
    ]);         0.62
     :-)        0.62
    .           0.61
     as         0.60
    Act Density 0.035%

    No Known Activations