INDEX
Explanations
terms related to dominance or prevailing conditions
Negative Logits
Token      Logit
<eos>      -0.70
</i>       -0.68
au         -0.63
en         -0.62
.          -0.60
</b>       -0.59
cer        -0.59
}          -0.59
(blank)    -0.58
й          -0.56
Positive Logits
Token         Logit
dominate      1.97
DOMIN         1.95
domin         1.93
dominating    1.90
Domin         1.90
dominance     1.88
dominated     1.88
dominates     1.87
Domin         1.85
domin         1.84
Activations Density: 0.153%