INDEX
Explanations
instances where actions or responsibilities are being taken seriously
references to responsibilities, concerns, and issues
Negative Logits
etheless      -0.80
entimes       -0.65
accompanies   -0.64
furt          -0.61
los           -0.60
ITNESS        -0.60
izabeth       -0.59
whose         -0.57
DA            -0.55
nor           -0.55
Positive Logits
squarely      0.91
aside         0.86
down          0.82
into          0.81
overboard     0.81
seriously     0.75
intact        0.74
onto          0.73
firmly        0.72
forward       0.72
Activations Density: 0.204%