INDEX
Explanations
phrases related to strong emotions
intense emotional experiences and societal structures
Negative Logits
zens          -0.63
pport         -0.62
ãĥķãĤ¡        -0.61
erenn         -0.56
Prepare       -0.55
ãĤ            -0.54
Profession    -0.53
Pacific       -0.53
GAN           -0.53
recent        -0.52
Positive Logits
consisted     1.59
lasted        1.52
remained      1.42
seemed        1.41
depended      1.37
tended        1.35
differed      1.35
didnt         1.29
resembled     1.29
lacked        1.26
Activations Density: 0.600%