INDEX
Explanations
words related to control and power
instances of the word "control" in various contexts
Negative Logits
Token       Logit
igmat       -0.70
udget       -0.62
rouse       -0.62
eday        -0.57
Zeit        -0.56
rehend      -0.56
enegger     -0.56
GGGGGGGG    -0.55
Lect        -0.55
sung        -0.55
Positive Logits
Token       Logit
over        1.16
over        0.98
of          0.92
orship      0.88
overs       0.87
OVER        0.82
thereof     0.82
ignty       0.81
holding     0.80
exercised   0.78
Activations Density 0.089%