INDEX
Explanations
phrases related to emotional reactions and decision-making
verbs related to emotions and negative experiences
Negative Logits
maxwell     -0.71
peat        -0.70
eworks      -0.70
marine      -0.69
adra        -0.67
paio        -0.67
abin        -0.67
phabet      -0.66
portion     -0.65
"]=>        -0.65
Positive Logits
!/          0.62
retreating  0.62
Spac        0.61
inaction    0.61
exhausted   0.61
Pitch       0.61
Gim         0.61
Centauri    0.60
[|          0.59
exile       0.58
Activations Density: 0.181%