INDEX
Explanations
words associated with confinement or restrictions
Negative Logits
Token       Logit
.Builder    -0.15
amura       -0.15
(Buffer     -0.15
deaux       -0.14
(Bundle     -0.14
/block      -0.14
mgr         -0.14
mund        -0.14
exact       -0.13
tries       -0.13
Positive Logits
Token       Logit
bs          0.67
b           0.67
ba          0.54
bed         0.54
б           0.50
be          0.49
bd          0.49
bb          0.49
bc          0.48
bin         0.48
Activations Density 0.148%