INDEX
Explanations
mentions of self-forgiveness and anxiety-related behaviors
Negative Logits
advoc       -0.55
aven        -0.55
opponent    -0.54
affili      -0.53
boarded     -0.53
odo         -0.52
ulic        -0.52
docks       -0.52
healer      -0.50
dips        -0.50
Positive Logits
Afterwards      0.91
Nevertheless    0.87
Consequently    0.86
Nonetheless     0.85
Otherwise       0.84
Moreover        0.83
However         0.83
Furthermore     0.81
Later           0.81
Meanwhile       0.81
Activations Density 1.935%