INDEX
Explanations
requests or instructions
references to actions or requests directed at the reader or audience
Negative Logits
partic      -0.58
ement       -0.57
difficulty  -0.55
perm        -0.52
pains       -0.51
pole        -0.51
cod         -0.51
cknowled    -0.50
sensit      -0.50
derog       -0.50
Positive Logits
See           0.69
\/\/          0.57
Videos        0.57
"))           0.56
á½            0.56
img           0.56
Shap          0.56
rina          0.55
Interstitial  0.55
gdala         0.54
Activations Density 0.019%