INDEX
Explanations
instances of the token "ple" or its variations, indicating a focus on pleasure-related concepts
Negative Logits
ERA                 -0.77
Kenobi              -0.76
LER                 -0.75
uador               -0.73
quickShipAvailable  -0.72
~~~~                -0.70
NRS                 -0.70
lar                 -0.69
artifacts           -0.68
Versions            -0.68
Positive Logits
teness  1.33
asure   0.97
thora   0.97
mented  0.95
xt      0.86
ASE     0.84
ple     0.82
asures  0.82
ments   0.81
ased    0.81
Activations Density 0.006%