INDEX
Explanations
statements that express apologies or address misunderstandings
Expressions of regret or apology
apologies and mistakes
Negative Logits
rangs           -0.39
ellt            -0.38
precies         -0.37
afat            -0.36
aarrggbb        -0.36
mengan          -0.36
眉头            -0.36
strani          -0.36
or              -0.35
ActiveRecord    -0.34
Positive Logits
apologies       1.23
apologize       1.23
Oops            1.18
apologized      1.17
apology         1.11
apologise       1.10
apologizing     1.10
oops            1.09
Apologies       1.08
apologised      1.07
Activations Density: 0.214%