INDEX
Explanations
phrases related to rules, regulations, and enforcement
content related to authority and regulations
Negative Logits
neath         -0.68
Registered    -0.64
nutshell      -0.64
âĵĺ           -0.63
llers         -0.62
Trilogy       -0.61
Twins         -0.59
nova          -0.58
Kinnikuman    -0.58
=""           -0.58
Positive Logits
deemed        1.03
deem          0.87
warranted     0.85
threat        0.84
deems         0.84
jeopard       0.83
threatened    0.83
threats       0.83
endanger      0.83
reasonably    0.81
Activations Density 0.747%
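
The density figure above can be read as the share of tokens on which the feature fires. The following is a minimal sketch of that calculation, assuming density means "percentage of tokens with an activation above zero"; the page itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and not taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature is
    active. This is an illustrative definition, not one stated on the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Toy per-token activations, made up for illustration only.
toy_activations = [0.0, 0.0, 1.7, 0.0]
print(f"{activation_density(toy_activations):.3f}%")
```

Under this reading, a density of 0.747% would mean the feature is active on roughly 7 or 8 out of every 1,000 tokens sampled.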