Explanations
phrases concerning informed decision-making and personal agency
Negative Logits
заб             -0.20
igue            -0.18
orda            -0.15
ç§              -0.15
Ìĥ              -0.14
ancock          -0.14
configurable    -0.14
568             -0.14
ạp              -0.14
cott            -0.14
Positive Logits
informed        0.37
educated        0.32
choices         0.26
snap            0.25
educated        0.25
-educated       0.24
wise            0.23
sound           0.23
smart           0.22
intelligent     0.22
Activations Density 0.066%