INDEX
Explanations
mentions of the word "soft" with varying intensities
Negative Logits
Token       Logit
agher       -0.73
reon        -0.72
ulhu        -0.70
OUGH        -0.70
naires      -0.67
Ancients    -0.64
Pax         -0.63
IVERS       -0.62
DEV         -0.61
Forever     -0.61
Positive Logits
Token       Logit
ening       1.24
ener        1.21
ball        1.17
hearted     1.05
eners       1.04
cover       1.00
palate      0.97
ened        0.95
balls       0.95
est         0.91
Activations Density: 0.013%