INDEX
    Explanations

    references to specific functionalities or usage instructions related to technology or abilities

    Text after "simply" or variants

    introducing instructions or actions

    Negative Logits
    Token            Logit
    "findpost"       -0.68
    " Wegen"         -0.55
    " dared"         -0.54
    "حياته"          -0.53
    " eventual"      -0.53
    "Trotz"          -0.52
    " Rücksicht"     -0.52
    "urther"         -0.52
    "waitKey"        -0.51
    "それでも"       -0.50
    Positive Logits
    Token               Logit
    " simply"           1.39
    "Simply"            1.36
    " Simply"           1.28
    "simply"            1.25
    " einfach"          1.11
    " simplemente"      0.98
    " simplement"       0.95
    " פשוט"             0.94
    " semplicemente"    0.92
    "Einfach"           0.92
    Act Density 0.512%

    No Known Activations