INDEX
    Explanations

    phrases indicating intensity or comparison

    phrases emphasizing the concept of minimization or a lower bound

    Negative Logits
    Token         Logit
    "kefeller"    -0.70
    "xs"          -0.69
    "rows"        -0.68
    "inal"        -0.63
    "borg"        -0.63
    "asms"        -0.63
    "gaard"       -0.61
    "dal"         -0.59
    "cats"        -0.59
    "stocks"      -0.58
    Positive Logits
    Token             Logit
    "Gi"              0.77
    " suffice"        0.71
    "Lago"            0.70
    "orah"            0.68
    "FontSize"        0.65
    "agogue"          0.63
    " provocation"    0.62
    " assume"         0.62
    "ruck"            0.60
    "taining"         0.60
    Activation Density: 0.037%

    No Known Activations