    Explanations

    comparisons and evaluations between different entities or situations

    descriptions of surprising or distressing situations

    Negative Logits
    "ocrates"      -0.69
    "anners"       -0.65
    "oline"        -0.63
    " etc"         -0.60
    "rained"       -0.60
    " decaying"    -0.58
    "ustomed"      -0.57
    "uberty"       -0.56
    " whoever"     -0.56
    "PLIED"        -0.56
    Positive Logits
    " notable"     0.70
    " noteworthy"  0.70
    "ici"          0.69
    " besides"     0.68
    "icy"          0.67
    " interven"    0.63
    "word"         0.62
    "Featured"     0.62
    " downside"    0.62
    " eyebrow"     0.61
    Act Density 0.357%

    No Known Activations