NEGATIVE LOGITS
atrium       0.49
verstehen    0.42
montrer      0.41
MOQ          0.41
definir      0.40
venir        0.39
arrivée      0.38
Schmidt      0.38
வில          0.38
ヴィ          0.38
POSITIVE LOGITS
Oops           0.69
Oops           0.64
inadvert       0.57
apologies      0.57
inadvertently  0.56
oops           0.55
sorry          0.55
accidentally   0.54
Sorry          0.52
Sorry          0.52
Activations Density: 0.001%