Explanations
legal terms and court case related words
references to legal restrictions or censorship
Negative Logits
xual        -0.69
Hurricanes  -0.69
Blessed     -0.66
earned      -0.63
cession     -0.62
signed      -0.61
supp        -0.61
paying      -0.60
guided      -0.60
Fault       -0.60
Positive Logits
gag     1.30
glers   1.22
reel    0.80
reflex  0.79
eries   0.78
gey     0.77
weed    0.74
zeb     0.73
bags    0.71
bag     0.71
Activations Density 0.002%