INDEX
Explanations
phrases starting with "in all" followed by different contexts or situations
phrases that emphasize universality or totality
Negative Logits
lash      -0.70
Schwar    -0.64
LOG       -0.64
Fenrir    -0.63
rouse     -0.63
potion    -0.62
uced      -0.61
Tale      -0.61
label     -0.58
heed      -0.58
Positive Logits
likelihood     1.23
seriousness    1.20
honesty        1.20
respects       1.19
ocating        1.15
directions     1.09
fairness       1.08
phases         1.07
contexts       1.05
probability    1.04
Activations Density 0.050%