INDEX
    Explanations

    expressions of gratitude or appreciation

    expressing gratitude or pride

    Negative Logits
     ujednoznacz      -0.60
     otomatig         -0.51
    })*/              -0.50
                      -0.50
    UserScript        -0.48
     autorytatywna    -0.47
     Meksiku          -0.47
    }*/               -0.46
    Архівовано        -0.46
    Зноскі            -0.46
    Positive Logits
     overras          0.51
     freue            0.49
     proud            0.47
     NgModule         0.47
     remercie         0.44
     glad             0.43
     erwar            0.43
     heureux          0.43
     congrat          0.42
     excited          0.42
    Act Density 0.010%

    No Known Activations