INDEX
Explanations
expressions of pride and related sentiments
Negative Logits
Token            Logit
KF               -0.61
Composable       -0.58
worms            -0.57
__":             -0.56
Dior             -0.56
persons          -0.55
ergo             -0.54
Monks            -0.54
HandlerContext   -0.54
łaj              -0.52
Positive Logits
Token            Logit
proud            1.76
proud            1.76
Proud            1.70
Proud            1.64
orgull           1.60
pride            1.60
stolz            1.47
proudly          1.45
orgullo          1.40
pride            1.39
Activations Density: 0.061%