    Explanations

    phrases indicating a degree of intensity or comparison

    phrases indicating a degree of messiness or complexity

    Negative Logits
    rats    -0.84
    reys    -0.81
    ħĭ      -0.80
    metics  -0.77
    pects   -0.76
    rates   -0.76
    uers    -0.75
    oons    -0.74
    ards    -0.74
    atars   -0.74
    Positive Logits
     luck            1.01
     overlap         0.85
     extra           0.76
     mischief        0.74
     trouble         0.73
     elbow           0.73
    angu             0.72
     irony           0.72
     misinformation  0.71
     realism         0.71
    Act Density 0.044%

    No Known Activations