Explanations
phrases indicating recognition or acknowledgment of achievements
Negative Logits
[token missing]    -0.83
Chwiliwch          -0.81
Xna                -0.76
UserScript         -0.73
enderror           -0.70
CUC                -0.68
الرياضيه           -0.68
ⓧ                  -0.67
Bux                -0.67
Jove               -0.67
Positive Logits
égard                   0.51
</u>                    0.43
versus                  0.43
c                       0.42
Silva                   0.42
vs                      0.41
Kurz                    0.41
lingua                  0.41
did                     0.40
させていただきました    0.39
Activations Density 0.271%