Explanations
links that refer to additional content or continuation of the article
mentions of "below" in the text
Negative Logits
mut         -0.63
restraint   -0.63
propri      -0.60
pretext     -0.60
capacity    -0.59
supp        -0.59
reincarn    -0.58
real        -0.58
drive       -0.57
priority    -0.57
Positive Logits
Below       3.83
Below       2.45
BELOW       2.19
below       2.06
Above       1.99
below       1.82
Above       1.61
Here        1.46
above       1.37
Within      1.35
Activations Density: 0.009%