    NEGATIVE LOGITS
    0.46  'thirty'
    0.42  '三十'
    0.40  '𝟯'
    0.39  'ệp'
    0.39  'Thirty'
    0.38  'ជំ'
    0.38  '<unused39>'
    0.38  (token blank in source)
    0.37  'দ্বিতীয়'
    0.37  ' Thirty'
    POSITIVE LOGITS
    0.41  ' "'
    0.40  '7'
    0.40  ' LaTeX'
    0.37  ' JavaScript'
    0.37  ' &'
    0.37  ' est'
    0.36  ' *'
    0.36  '&'
    0.35  ' system'
    0.35  ' <'
    Activation Density: 0.006%

    No Known Activations