Explanations
phrases related to taking action or making decisions
phrases related to taking action or making significant decisions
Negative Logits
Token      Logit
ongo       -0.75
amation    -0.70
Weather    -0.70
uppet      -0.66
ongyang    -0.63
Origin     -0.63
FX         -0.62
nesota     -0.61
Ter        -0.61
Vari       -0.60
Positive Logits
Token          Logit
anew           0.98
strip          0.87
again          0.85
whereby        0.80
wisely         0.80
unilaterally   0.79
responsibly    0.78
willingly      0.77
naires         0.76
reluctantly    0.76
Activations Density 0.267%