INDEX
Explanations
words related to emotions and behaviors, particularly negative emotions like mistrust, anger, and resentment, as well as positive emotions like gratitude and enthusiasm
concepts related to trust, accountability, and emotional states in various contexts
Negative Logits
pload         -0.78
aceae         -0.72
kernel        -0.71
iannopoulos   -0.65
iann          -0.65
abet          -0.65
otin          -0.65
iov           -0.63
adobe         -0.63
orno          -0.59
Positive Logits
alike         1.41
thereof       0.91
respectively  0.86
fulness       0.73
overcame      0.69
amongst       0.68
depending     0.65
versa         0.64
consequ       0.63
emanating     0.62
Activations Density 0.304%
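
A density figure like the one above can be read as the share of sampled token positions on which the feature fires. The sketch below assumes that reading: it counts positions whose activation exceeds a threshold and divides by the total. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical illustrations; they are not taken from this page or from any particular tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of sampled token positions
    on which the feature is active (activation > threshold).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 3 of which fire.
sample = [0.0] * 997 + [0.4, 1.2, 2.5]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this definition a density of 0.304% corresponds to roughly 3 active positions per 1000 sampled tokens, which is what the hypothetical sample approximates.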