INDEX
Explanations
words related to aggression and friendliness
Negative Logits
asciug      -0.78
issipp      -0.73
<h5>        -0.69
inset       -0.68
awtextra    -0.68
Moments     -0.68
zeug        -0.67
setof       -0.67
gefügt      -0.67
limone      -0.66
Positive Logits
agres           1.00
aggres          0.94
aggressive      0.92
hostile         0.92
hostility       0.91
aggressiveness  0.89
aggressive      0.85
friendly        0.84
friendliness    0.82
Friendly        0.82
Activations Density: 0.012%
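
As a rough illustration of what an activation density figure like the one above could mean, here is a minimal sketch. It assumes that density is the percentage of token positions on which the feature's activation is above zero; the page does not state its exact definition, so that reading, the function name `activation_density`, and the `threshold` parameter are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of tokens on which the feature fires.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Made-up data: the feature fires on 1 of 10,000 token positions.
sample = [0.0] * 9999 + [3.2]
print(f"{activation_density(sample):.3f}%")
```

Under this reading, a density of 0.012% would correspond to the feature firing on roughly 12 of every 100,000 tokens.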