INDEX
Explanations
expressions of congratulations and recognition
Negative Logits
erent        -0.51
ftagPool     -0.50
boot         -0.49
Diam         -0.49
skyl         -0.48
challenges   -0.48
ⓧ            -0.48
garn         -0.48
usermodel    -0.47
😢           -0.46
Positive Logits
dispatch          1.20
dispatch          1.07
Dispatch          0.98
Congratulations   0.93
congratulations   0.92
Dispatch          0.91
Congrats          0.90
congrats          0.85
Congratulations   0.84
aéri              0.84
Activations Density 0.099%