INDEX
Explanations
phrases related to forgiveness
terms related to forgiveness
Negative Logits
anto    -0.74
eding   -0.72
ulhu    -0.71
Hunt    -0.70
amen    -0.70
hack    -0.67
Hop     -0.67
rus     -0.67
urg     -0.66
psey    -0.66
Positive Logits
forgiven      1.30
forgiveness   1.20
forgive       1.12
forgiving     0.90
pard          0.88
give          0.87
forg          0.83
pardon        0.83
giving        0.82
ivable        0.81
Activations Density: 0.010%