INDEX
NEGATIVE LOGITS
    Token        Logit
    7            1.47
    8            1.41
    9            1.37
    1            1.35
     escolas     1.29
    algèbre      1.28
     संग्रहालय    1.27
    6            1.25
    2            1.24
    微分         1.24
POSITIVE LOGITS
    Token             Logit
     spontaneously    1.04
     shedding         1.03
     slowly           1.02
     heavily          0.99
     hovered          0.99
     sweaty           0.98
     stick            0.98
     carrot           0.97
     wrapped          0.97
     awhile           0.96
    Activation Density: 0.074%

    No Known Activations