INDEX
Explanations
commands or actions related to enabling or disabling functions or features
references to disabling or enabling features and functionalities
Negative Logits
eal          -0.85
alez         -0.83
aternity     -0.79
rio          -0.75
alg          -0.74
kaya         -0.74
ablishment   -0.71
Mart         -0.70
eday         -0.70
arah         -0.70
Positive Logits
disable      1.27
disabling    1.16
Disable      0.98
inhibitor    0.91
ments        0.90
MENTS        0.90
inhibitors   0.89
inhibition   0.88
disable      0.84
MENT         0.84
Activations Density: 0.018%