    Explanations

    elements related to programming functions and variable definitions

    Negative Logits
    Token         Logit
    "TagMode"     -0.39
    "2"           -0.31
    "1"           -0.29
    "8"           -0.28
    " stessa"     -0.28
    " مم"         -0.28
    " zufolge"    -0.28
    "ouvrage"     -0.28
    " einz"       -0.28
    "3"           -0.27
    Positive Logits
    Token           Logit
    " zwiſchen"     0.85
    "ðsíða"         0.83
    " queſta"       0.77
    " ſehr"         0.76
    " للمعارف"      0.76
    "ſehen"         0.76
    " وتسجيلات"     0.75
    "ſſung"         0.75
    " ddelwed"      0.75
    " témoig"       0.74
    Activation Density: 0.585%

    No Known Activations