    NEGATIVE LOGITS
    "三百"         1.07
    "Forty"        0.93
    "Thirty"       0.93
    " Forty"       0.89
    " hundreds"    0.88
    " Thirty"      0.88
    "Fifty"        0.87
    "二百"         0.87
    " Sixty"       0.83
    " Fifty"       0.83
    POSITIVE LOGITS
    " eight"       1.75
    " ten"         1.70
    "8"            1.61
    " ocho"        1.56
    " seven"       1.56
                   1.52
    "১০"           1.51
    " TEN"         1.46
    " १०"          1.45
    " ۱۰"          1.45
    Act Density 0.870%

    No Known Activations