Explanations
instances of the word "thin" and its variations, indicating a focus on thinness
Negative Logits
eb        -0.17
ãĥ¼ãĥ    -0.16
ains      -0.16
ufen      -0.16
ein       -0.15
inous     -0.15
vil       -0.15
iences    -0.15
ean       -0.14
hem       -0.14
Positive Logits
ning      0.39
NING      0.26
ners      0.23
ening     0.23
/th       0.22
slice     0.21
slices    0.20
ness      0.20
gauge     0.20
kest      0.19
Activations Density 0.025%
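The dashboard does not say how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 4,000 token positions, with the feature firing on one.
example = [0.0] * 3999 + [1.7]
density = activation_density(example)
print(f"{density:.3%}")  # prints "0.025%"
```

Under this reading, a density of 0.025% means the feature fires on roughly one token in every 4,000, which fits a feature tied to a single word stem.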