INDEX
Explanations
negative constructions and prohibitions in language
Negative Logits
SPDX        -0.16
ko          -0.15
stru        -0.14
vrd         -0.14
CAD         -0.14
zig         -0.14
-Za         -0.14
enever      -0.14
Enumerator  -0.13
Vill        -0.13
Positive Logits
uchi        0.17
oran        0.17
proof       0.15
vail        0.14
åī¯         0.14
abler       0.14
partial     0.14
dint        0.14
aku         0.14
ange        0.14
Activations Density 0.003%