Explanations
words related to enabling or disabling settings or features
phrases related to enablement and configuration settings
Negative Logits
apest        -0.95
ivan         -0.75
letes        -0.75
orians       -0.75
sheets       -0.75
ourcing      -0.75
Catal        -0.74
von          -0.74
ographers    -0.74
forts        -0.73
Positive Logits
threshold    0.84
override     0.81
enabled      0.81
automatic    0.80
boolean      0.79
mode         0.78
CONFIG       0.78
integer      0.77
defaults     0.76
parity       0.75
Activations Density 0.234%