INDEX
Explanations
instances of people being pleased with different situations or events
instances of satisfaction or approval
Negative Logits
opic      -0.88
amen      -0.74
çīĪ       -0.72
ngth      -0.69
stove     -0.69
opers     -0.66
ocular    -0.65
ammy      -0.65
arin      -0.64
ut        -0.64
Positive Logits
ienced    0.77
iated     0.72
joy       0.71
Shad      0.71
vale      0.67
ragon     0.67
liness    0.67
issance   0.66
onlook    0.65
jad       0.65
Activations Density: 0.041%