    Explanations

    hyphenated descriptions

    Negative Logits
    -           1.31
    ـ           1.12
    -/-         1.05
    ــ          1.04
    -//         1.00
    -}\         0.99
    -(\         0.97
    -”          0.96
    -'+         0.95
     typos      0.95

    Positive Logits
    based       1.93
    sized       1.81
    themed      1.75
    related     1.61
    shaped      1.59
    level       1.59
    oriented    1.57
    driven      1.57
    induced     1.54
    friendly    1.54

    Act Density 0.750%

    No Known Activations