    Explanations

    phrases related to justifications and conditions

    Negative Logits
    Token            Logit
    " growing"       -0.77
    " Increasing"    -0.76
    " Growing"       -0.75
    " increasing"    -0.74
    " Changing"      -0.74
    " changing"      -0.73
    "Growing"        -0.73
    "growing"        -0.71
    "Increasing"     -0.71
    "Changing"       -0.70

    Positive Logits
    Token            Logit
    " saying"        1.66
    " stating"       1.47
    " noting"        1.43
    " suggesting"    1.37
    "saying"         1.35
    " mentioning"    1.35
    " wondering"     1.32
    " describing"    1.31
    " asking"        1.27
    " thinking"      1.26
    Activation Density: 0.493%

    No Known Activations
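
To make the density figure above concrete, here is a minimal sketch of how such a value could be computed. It assumes that density means the fraction of sampled token positions on which the feature's activation is above zero; the source does not state the exact definition. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and used only for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 5 of which are active.
sample = [0.0] * 995 + [0.8, 1.2, 0.3, 2.1, 0.5]
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.500%"
```

Under this reading, a density of 0.493% would correspond to the feature firing on roughly 5 of every 1000 sampled tokens.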