Explanations
potential negative outcomes
Negative Logits
tragedy        1.06
tragic         1.00
heartbreaking  0.91
horrific       0.88
deadly         0.88
horrifying     0.85
fatalities     0.84
crisis         0.84
failure        0.84
Tragedy        0.84
Positive Logits
Risks          0.92
βοη            0.92
Ris            0.90
stimulates     0.89
ISI            0.87
Ris            0.85
risks          0.84
带来           0.83
potential      0.83
Potential      0.83
Activations Density 0.545%