Explanations
descriptors of friendly and courteous interactions
Negative Logits
Token         Logit
Aufs          -0.57
publicidad    -0.54
sohn          -0.54
cometh        -0.54
pubblicità    -0.52
McKe          -0.52
paraître      -0.51
nokt          -0.51
imprend       -0.51
hemato        -0.50
Positive Logits
Token         Logit
friendly      2.82
Friendly      2.61
friendly      2.54
Friendly      2.52
friendliness  1.72
unfriendly    1.68
freundlichen  1.34
vriende       1.34
友好          1.28
freund        1.26
Activations Density 0.040%
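
The density figure is not defined on this page. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is nonzero. The sketch below shows that calculation on toy data; the function name `activation_density`, the `threshold` parameter, and the activation values are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means firing frequency over a token sample.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy data (hypothetical): 10,000 token positions, 4 of which activate the feature.
toy_activations = [0.0] * 9996 + [1.3, 0.7, 2.1, 0.4]
density = activation_density(toy_activations)
print(f"{density:.3%}")  # prints 0.040%
```

Under this reading, a density of 0.040% corresponds to the feature firing on roughly 4 of every 10,000 tokens.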