INDEX
    Explanations

    expressions indicating reaching a high or extreme level or taking drastic actions

    phrases indicating limits or boundaries in actions or ideas

    Negative Logits
    Token          Logit
    "itu"          -0.76
    "bed"          -0.68
    "ivered"       -0.65
    " Sovere"      -0.65
    "ipation"      -0.63
    "ixture"       -0.63
    " Nep"         -0.62
    "ukes"         -0.61
    "stable"       -0.59
    "ixtures"      -0.59
    Positive Logits
    Token          Logit
    "WARD"         0.89
    " sidx"        0.82
    " overboard"   0.81
    " derog"       0.79
    " towards"     0.75
    " unnoticed"   0.74
    "irtual"       0.74
    " toward"      0.74
    " boldly"      0.72
    " lengths"     0.68
    Activation Density: 0.057%

    No Known Activations