INDEX
Explanations
elements related to community values and guiding principles
Negative Logits
compressed    -0.14
elyn          -0.14
oug           -0.13
oucher        -0.13
.jetbrains    -0.13
_clock        -0.13
bordered      -0.13
oux           -0.13
stroy         -0.13
.mixin        -0.13
Positive Logits
inform        0.50
informs       0.47
inform        0.45
informed      0.44
informing     0.44
Inform        0.43
Inform        0.41
shape         0.35
shapes        0.32
under         0.32
Activations Density 0.241%