INDEX
Explanations
words related to physical activities or sensations, such as breathing, sighing, and feeling relief
words related to relaxation or fatigue
Negative Logits
channels     -0.73
ahon         -0.71
rued         -0.70
allas        -0.70
folios       -0.68
Emin         -0.67
rity         -0.66
uve          -0.65
avenues      -0.65
orate        -0.65
Positive Logits
ufact        0.89
vironment    0.81
goodbye      0.80
atic         0.76
¯¯¯¯         0.76
puter        0.75
oleon        0.75
ãĤ´ãĥ³       0.74
itual        0.72
atical       0.71
Activations Density 0.021%