INDEX
    Explanations

    references to specific numbers or numerical figures

    Negative Logits
    Token    Logit
    "5"      -0.96
    "4"      -0.88
    "7"      -0.88
    "3"      -0.86
    "6"      -0.81
    "9"      -0.80
    "8"      -0.77
    "0"      -0.72
    "2"      -0.71
    "1"      -0.68
    Positive Logits
    Token       Logit
    " Seven"    2.16
    " Nine"     2.13
    "Seven"     2.12
    "Nine"      2.06
    " Six"      2.05
    " nine"     2.02
    " Eight"    1.97
    " eight"    1.96
    "Six"       1.95
    "seven"     1.93
    Act Density 0.108%

    No Known Activations