Explanations
instances where someone is able to accomplish a task successfully
instances of the word "able" indicating capability or proficiency
Negative Logits
Alone       -0.72
Parents     -0.72
Caval       -0.70
Yose        -0.69
Deer        -0.68
Famous      -0.66
origin      -0.66
sterling    -0.65
Warcraft    -0.65
Tradition   -0.64
Positive Logits
ioned       0.93
bodied      0.90
reys        0.89
't          0.80
itud        0.78
llor        0.77
manage      0.77
access      0.76
awaru       0.76
¶ħ          0.75
Activations Density 0.030%