INDEX
Explanations
positive qualities or characteristics related to an individual
qualities and attributes that contribute to a person's character and popularity
Negative Logits
Token         Logit
Situation     -0.74
akh           -0.67
COP           -0.64
inaction      -0.64
Timeline      -0.63
Activity      -0.63
Procedure     -0.63
/?            -0.62
odynamic      -0.61
Procedures    -0.61
Positive Logits
Token         Logit
matched       1.04
outwe         1.01
unmatched     1.00
contrasted    0.99
outweigh      0.97
overshadow    0.94
shone         0.94
contagious    0.93
compliment    0.93
translate     0.91
Activations Density: 0.384%