INDEX
    Explanations

    phrases indicating comparison or cause and effect

    phrases indicating increasing quantities or magnitude

    Negative Logits
    istani    -0.87
    ahime     -0.79
    ioch      -0.76
    edit      -0.73
    aido      -0.71
    iasco     -0.70
    lease     -0.70
    idon      -0.69
    opol      -0.69
    amon      -0.68
    Positive Logits
     better      1.46
     harder      1.45
     worse       1.39
     stronger    1.39
     louder      1.36
     clearer     1.35
     quicker     1.32
     greater     1.32
     easier      1.31
     happier     1.30
    Act Density 0.029%
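
    The page does not define "Act Density". A minimal sketch, assuming it means the percentage of sampled tokens on which the feature has a nonzero activation; the function name `activation_density` and the sample data are hypothetical and only illustrate the arithmetic behind a figure like 0.029%.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. The source page does not state its definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 29 of 100,000 tokens.
sample = [1.0] * 29 + [0.0] * 99_971
density = activation_density(sample)
print(f"{density:.3f}%")
```

    Under that assumption, a density of 0.029% corresponds to roughly 29 active tokens per 100,000 sampled.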

    No Known Activations