INDEX
Explanations
phrases related to the ability to accomplish tasks or actions
Negative Logits
Parents       -0.72
Tradition     -0.70
Sins          -0.68
Generation    -0.67
Caval         -0.67
Alone         -0.66
Belief        -0.65
Bots          -0.64
Famous        -0.64
Principles    -0.64
Positive Logits
ioned         1.04
bodied        0.97
reys          0.92
't            0.92
uate          0.80
awaru         0.80
ords          0.80
istically     0.78
ittees        0.78
Reviewer      0.77
Activations Density: 5.140%