INDEX
Explanations
the word "plain" with high activation values, possibly focusing on contexts related to simplicity or directness
instances of the word "plain" and its variations in different contexts
Negative Logits
Token          Logit
otos           -0.90
etheus         -0.83
yip            -0.82
allery         -0.77
glomer         -0.76
lasses         -0.74
bucks          -0.73
interstitial   -0.73
entric         -0.72
alez           -0.71
Positive Logits
Token          Logit
plain          1.08
text           1.06
sheet          0.97
plain          0.96
sheets         0.92
rolled         0.89
cloth          0.87
vanilla        0.87
\\\\\\\\       0.84
ified          0.82
Activations Density 0.018%
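
The density figure can be read as the share of sampled token positions on which the feature fires at all. The sketch below illustrates that reading only; it is an assumption about how the number is derived, not the dashboard's actual code. The function name `activation_density`, the zero threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumes "density" means the share of token positions where the
    feature is active; the source does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 18 of which activate the feature.
sample = [0.0] * 99_982 + [1.0] * 18
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.018%
```

Under this reading, a density of 0.018% means the feature is active on roughly 18 of every 100,000 tokens, which fits a feature tied to one specific word.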