INDEX
    Explanations

    phrases indicating functionality and effectiveness

    NEGATIVE LOGITS
    orianCalendar
    -0.69
     للمعارف
    -0.68
    IntoConstraints
    -0.68
     Montagu
    -0.63
    Agora
    -0.62
     Skor
    -0.62
    __*/
    -0.61
    olyte
    -0.60
    <>("
    -0.60
    -0.58
    POSITIVE LOGITS
     works
    1.03
     Works
    0.91
     funktioniert
    0.90
    Works
    0.89
     WORKS
    0.88
    works
    0.87
     fungerar
    0.85
     fungerer
    0.84
     worked
    0.82
     funguje
    0.82
    Activation Density: 0.152%

    No Known Activations