INDEX
    Explanations

    Symbols and punctuation marks used in mathematical or technical contexts

    Numbers, especially single digits

    Negative Logits
    Token        Logit
    "Six"        -0.66
    " Six"       -0.65
    " Nine"      -0.65
    "SIX"        -0.63
    "Nine"       -0.60
    "Eight"      -0.60
    " Eight"     -0.59
    "Seven"      -0.57
    " nine"      -0.56
    " Twenty"    -0.55
    Positive Logits
    Token        Logit
    "3"          1.13
    "1"          1.10
    "2"          1.06
    "4"          1.05
    "5"          0.98
    "6"          0.92
    "7"          0.90
    "8"          0.85
    "9"          0.80
    "0"          0.71
    Act Density 0.101%
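
    The activation density figure can be read as the share of token positions on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under that assumption. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active. The source page does not define the term.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [2.5]
print(f"{activation_density(sample):.3%}")  # prints 0.100%
```

    On this reading, a density of 0.101% would mean the feature is active on roughly one token in a thousand.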

    No Known Activations