INDEX
Explanations
quantitative information such as numbers, counts, and percentages
references to violence in media
Negative Logits
manoeuv     -0.63
-)          -0.59
!'          -0.58
endif       -0.57
instincts   -0.56
;)          -0.56
rador       -0.55
*)          -0.54
meanwhile   -0.54
Sov         -0.54
Positive Logits
respective      0.63
iann            0.62
Including       0.57
ategories       0.55
DCS             0.55
each            0.54
respectively    0.54
excluding       0.54
Wells           0.52
isode           0.52
Activations Density 1.605%