Explanations
words related to dominance or control in various contexts
terms related to dominance or control in various contexts
Negative Logits
pt        -0.69
lev       -0.67
endment   -0.65
aration   -0.65
spir      -0.61
othy      -0.60
ilk       -0.59
ne        -0.59
resso     -0.59
ead       -0.59
Positive Logits
dominated      0.74
headlines      0.73
ICS            0.69
SHIP           0.69
dominates      0.69
overshadowed   0.67
pread          0.65
byss           0.65
Cav            0.64
dominate       0.63
Activations Density: 0.035%