Explanations
themes of emotional well-being and interpersonal connections
Negative Logits
colorful          -0.18
äs                -0.15
synchronize       -0.15
globalization     -0.15
dere              -0.14
colors            -0.14
Savior            -0.14
initialization    -0.14
synchronization   -0.14
realization       -0.14
Positive Logits
Embed             0.19
kindness          0.17
embedding         0.17
kind              0.17
KIND              0.17
/embed            0.16
spotting          0.16
Zy                0.16
-kind             0.15
decent            0.15
Activations Density 0.002%