INDEX
Explanations
the word "Product" with varying degrees of specificity indicated by different activation values
occurrences of the word "Product" and its variants
Negative Logits
laus          -0.74
trop          -0.66
Peninsula     -0.66
CLASSIFIED    -0.66
asons         -0.65
apo           -0.65
Brother       -0.61
Nights        -0.61
Electricity   -0.61
Shades        -0.60
Positive Logits
ivity         1.41
ively         1.26
ivities       1.08
iveness       1.03
packaging     0.85
Hunt          0.81
icons         0.77
rador         0.76
kit           0.75
ocratic       0.75
Activations Density 0.029%