INDEX
Explanations
expressions of apology or acknowledgement of wrongdoing
Negative Logits
OGND          -0.38
sendok        -0.36
թվական        -0.36
pranks        -0.35
באתר          -0.35
experiments   -0.35
municipales   -0.34
igång         -0.34
vecin         -0.34
geprek        -0.34
Positive Logits
disambiguazione   0.68
apologise         0.66
Apo               0.65
Apo               0.64
Qual              0.64
sorry             0.64
apologize         0.63
يتيمه             0.63
qualify           0.62
qualifies         0.62
Activations Density: 1.671%