INDEX
Explanations
specific entities or concepts referred to with the word "RULES"
phrases related to regulations and standards
Negative Logits
du         -0.70
athan      -0.69
issance    -0.66
           -0.66
itzer      -0.65
Insight    -0.64
asca       -0.63
aucuses    -0.63
oğan       -0.62
previews   -0.62
Positive Logits
forbids    1.02
limits     0.99
governs    0.96
stip       0.94
dictates   0.93
strict     0.92
enforced   0.90
stricter   0.86
Limits     0.86
governing  0.86
Activations Density 0.639%