INDEX
Explanations
phrases that describe perceptions and qualities of objects or experiences
Negative Logits
});*/         -0.75
)");          -0.70
}));          -0.70
CanadaChoose  -0.66
"){           -0.66
++){          -0.66
)";           -0.66
")){          -0.65
)*/           -0.65
("")]         -0.65
Positive Logits
looks     1.12
Looks     1.09
Looks     1.09
looks     1.08
appear    0.96
APPE      0.94
sounded   0.93
appears   0.93
sounding  0.90
looked    0.88
Activations Density 0.174%