INDEX
    Explanations

    references to numerical values, specifically large figures

    Negative Logits
    Token        Logit
    "MSN"        -0.79
    "href"       -0.76
    "lez"        -0.75
    " stead"     -0.73
    "cho"        -0.72
    "ration"     -0.69
    "ffield"     -0.67
    "imen"       -0.67
    "elly"       -0.65
    "ravings"    -0.65
    Positive Logits
    Token        Logit
    "mAh"        1.05
    "8000"       1.03
    " 8000"      0.89
    " 6000"      0.89
    "6000"       0.87
    "4000"       0.83
    " 7000"      0.82
    " 5000"      0.81
    " 4000"      0.81
    " 9000"      0.80
    Act Density 0.009%

    No Known Activations