Explanations
phrases containing the word "only" followed by a number
repeated phrases emphasizing exclusivity or a singular aspect
Negative Logits
azard      -0.72
ruary      -0.72
etz        -0.72
align      -0.71
mas        -0.71
each       -0.69
des        -0.69
comb       -0.68
enthusi    -0.66
staking    -0.65
Positive Logits
thing          1.23
downside       1.15
drawback       1.13
reason         1.13
remaining      1.12
exception      1.11
difference     1.03
caveat         1.01
surviving      0.98
conceivable    0.96
Activations Density: 0.040%