INDEX
Explanations
negative emotions and feelings
Negative Logits
proficiency        0.51
toughness          0.49
robustness         0.47
adherence          0.47
specialization     0.47
inactivity         0.46
eloquence          0.46
propensity         0.46
parametrization    0.46
friendliness       0.45
Positive Logits
excited       1.70
scared        1.70
frightened    1.70
frustrated    1.69
annoyed       1.68
anxious       1.67
terrified     1.63
angry         1.63
confused      1.58
irritated     1.58
Activations Density: 0.077%