    NEGATIVE LOGITS
    Token                 Logit
    "Fonte"               0.46
    "Princeton"           0.42
    "HMS"                 0.41
    " footnotes"          0.40
    (token not shown)     0.40
    " usern"              0.39
    " puntual"            0.38
    " fonte"              0.38
    " Atlet"              0.37
    " Hunde"              0.37
    POSITIVE LOGITS
    Token                 Logit
    " seventh"            0.73
    " tujuh"              0.69
    " seven"              0.68
    " ஏழு"                0.68
    (token not shown)     0.68
    (token not shown)     0.65
    " Seven"              0.64
    " सात"                0.64
    " هفت"                0.63
    "seven"               0.63
    Activation Density: 0.075%

    No Known Activations