Explanations
references to specific brands, products, or locations
Negative Logits
aunts           -0.76
manif           -0.66
flank           -0.64
manifold        -0.62
trap            -0.62
millionaires    -0.61
gui             -0.61
gap             -0.59
ploy            -0.59
asks            -0.59
Positive Logits
copyrighted     1.06
trademarks      1.01
copyright       0.98
affiliate       0.94
editorial       0.89
çīĪ             0.87
Disclaimer      0.86
Contribut       0.82
Copyright       0.82
Attribution     0.82
Activations Density 1.734%