INDEX
Explanations
mentions of authority figures and institutions related to law or governance
Negative Logits
Token            Logit
tera             -0.16
.scalablytyped   -0.16
obble            -0.16
ÙĩÙĨÚ¯           -0.15
orage            -0.15
.Proxy           -0.15
üny              -0.15
.bundle          -0.14
'gc              -0.14
.BLL             -0.14
Positive Logits
Token            Logit
(!               0.23
(!               0.22
iel              0.19
!                0.19
instead          0.19
(!!              0.16
no               0.16
ronic            0.16
Specifically     0.16
.                0.15
Activations Density: 0.273%