INDEX
Explanations
phrases related to feelings, opinions, and actions of individuals
expressions of emotions and opinions
Negative Logits
auga       -0.77
teness     -0.67
livion     -0.66
ousing     -0.62
entirety   -0.62
iciency    -0.61
txt        -0.61
fty        -0.60
hner       -0.59
gart       -0.59
Positive Logits
compared         0.91
interacts        0.83
differs          0.80
interacting      0.80
affects          0.79
compares         0.78
versus           0.77
nowadays         0.74
geographically   0.73
differently      0.72
Activations Density: 0.153%