    Explanations
    No Explanations Found
    Negative Logits
     eleven       1.03
     sixteen      0.99
     eighteen     0.97
     Sixteen      0.96
     thirteen     0.94
     seventeen    0.94
     fourteen     0.91
     eleventh     0.90
     twelve       0.89
     Fourteen     0.89
    Positive Logits
    3    1.92
    4    1.81
    5    1.70
    2    1.54
    6    1.53
    7    1.51
    9    1.44
    8    1.44
    1    1.33
    0    1.13
    Activation Density: 1.282%

    No Known Activations