INDEX
Explanations
specific product descriptions related to various items
references to products or items being described positively
Negative Logits
Token         Logit
Jews          -0.88
rollment      -0.86
redits        -0.85
termination   -0.81
aucuses       -0.79
igration      -0.79
Muslims       -0.77
Players       -0.76
Dialogue      -0.76
ugu           -0.76
Positive Logits
Token         Logit
versatile     1.15
product       1.13
inexpensive   1.11
thing         1.11
kit           1.09
sturdy        1.09
item          1.06
homemade      1.01
sleek         1.01
sucker        0.99
Activations Density 0.219%
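
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of tokens on which the feature's activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical illustrations, not taken from the dashboard, and the dashboard's exact definition may differ.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: density means "share of tokens where the feature fires";
    the tool that produced the figure above may define it differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1,000 tokens, 2 of which activate the feature.
sample = [0.0] * 998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

A low density like the one reported here means the feature fires on only a small share of tokens, which is typical of a sparse feature.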