INDEX
    Explanations
    Negative Logits
    Token          Logit
    " recentes"    -0.81
    " vostri"      -0.80
    " passés"      -0.79
    "✨:"           -0.76
    " grze"        -0.76
    " loisirs"     -0.76
    " variés"      -0.75
    "agissait"     -0.74
    " tuoi"        -0.74
    " Zeke"        -0.72
    Positive Logits
    Token          Logit
    " window"      1.88
    " Window"      1.72
    " windows"     1.68
    "window"       1.61
    " WINDOW"      1.52
    "Window"       1.46
    "WINDOW"       1.43
    "windows"      1.38
    " Windows"     1.34
    "Windows"      1.33
    Activation Density: 0.036%

    No Known Activations