INDEX
Explanations
phrases describing dual roles or functions
phrases emphasizing contrasting dualities or relationships
Negative Logits
cies          -0.83
zens          -0.78
cats          -0.77
в             -0.76
estyles       -0.75
ð             -0.74
ICES          -0.71
tz            -0.71
roups         -0.71
ibilities     -0.69
Positive Logits
protector     1.40
savior        1.23
centerpiece   1.15
catalyst      1.14
confid        1.12
facilit       1.10
healer        1.05
conduit       1.03
undermin      1.01
scourge       1.01
Activations Density 0.240%