Explanations
phrases related to long-term considerations or consequences
phrases related to long-term and short-term outcomes
Negative Logits
Token        Logit
othing       -0.73
yne          -0.71
MSN          -0.68
Bleach       -0.68
ymes         -0.67
Flavoring    -0.64
ofer         -0.64
utenberg     -0.64
zo           -0.63
atche        -0.63
Positive Logits
Token        Logit
term         0.98
term         0.97
aftermath    0.94
absence      0.90
afterlife    0.89
confines     0.88
timeframe    0.83
haul         0.80
vicinity     0.79
run          0.79
Activations Density: 0.058%