Explanations
References to religious figures, specifically those associated with the title "Reverend."
Negative Logits
holders       -0.83
bags          -0.69
boxes         -0.65
remlin        -0.64
falls         -0.63
proximity     -0.61
stuffing      -0.60
explosives    -0.60
WAYS          -0.59
aud           -0.59
Positive Logits
isions        1.28
olutions      1.20
olt           1.06
olver         1.02
ayne          1.01
ivals         1.00
olving        0.98
ision         0.95
ocation       0.93
olves         0.93
Activation Density: 0.016%
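A density figure like the one above can be read as the share of tokens on which the feature fires. The sketch below illustrates that arithmetic under stated assumptions: the page does not say how its density is computed, so the "activation greater than zero" threshold, the function name `activation_density`, and the toy data are all hypothetical.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as "active" when its activation is strictly
    greater than the threshold (0.0 by default). The source page does not
    state its own definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 2 active tokens out of 12,500.
acts = [0.0] * 12_498 + [3.1, 0.7]
density = activation_density(acts)
print(f"{density:.3%}")  # 0.016%
```

With these made-up numbers, 2 / 12,500 = 0.00016, which prints as 0.016%. A density this low would mean the feature fires on roughly one or two tokens in every ten thousand.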