INDEX
Explanations
expressions of apology and acknowledgment of errors
Negative Logits
Trouver       -0.51
didSet        -0.50
]")]          -0.49
ivably        -0.46
evaluates     -0.45
vyk           -0.45
Relatively    -0.44
годно         -0.43
интересно     -0.43
evaluation    -0.43
Positive Logits
sorry         2.51
sorry         2.21
apologies     2.19
Sorry         2.17
SORRY         2.13
apologize     2.10
Sorry         2.04
apology       1.92
Apologies     1.85
apologise     1.83
Activations Density 0.147%
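
As a rough illustration of what a density figure like this measures, the sketch below computes the share of tokens on which a feature has a nonzero activation. It assumes that "density" means the fraction of sampled tokens where the feature fires; the function name `activation_density` and the sample values are hypothetical and do not come from this page.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: a feature "fires" on a token when its activation is strictly
    greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    firing = sum(1 for a in activations if a > threshold)
    return 100.0 * firing / len(activations)


# Hypothetical per-token activations for one feature: it fires on 1 of 4 tokens.
sample = [0.0, 0.0, 0.0, 1.5]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.147% would mean the feature is active on roughly 1 or 2 of every 1,000 sampled tokens.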