INDEX
Explanations
words related to emotions, physical sensations, and social interactions
Negative Logits
ppel          -0.65
asketball     -0.63
REE           -0.63
wb            -0.63
FF            -0.59
ificantly     -0.58
Accounting    -0.58
LAND          -0.58
001           -0.57
iolet         -0.56
Positive Logits
iest           1.39
afforded       1.03
surrounding    1.02
inherent       1.02
emanating      0.97
iness          0.92
generated      0.91
aspect         0.89
wrought        0.89
plag           0.86
Activations Density 0.563%
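
An activation-density figure like the one above is commonly read as the fraction of sampled token positions on which the feature's activation is non-zero. The sketch below assumes that definition, which this page does not itself state. The function name `activation_density` and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: density means "share of token positions where the
    feature fires"; the source page does not say how its figure
    was computed.
    """
    values = list(activations)
    if not values:
        raise ValueError("need at least one activation")
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical sample: 1000 token positions, feature fires on 5 of them.
sample = [0.0] * 995 + [0.8, 1.2, 0.3, 2.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.563% would correspond to the feature firing on roughly 5 or 6 of every 1000 sampled tokens.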