Explanations
expressions of emotions and opinions
Negative Logits
Reputation    -0.17
rimon         -0.15
iasco         -0.14
rms           -0.14
idable        -0.14
rijk          -0.14
egrity        -0.14
REM           -0.13
.ribbon       -0.13
parçası       -0.13
Positive Logits
concern        0.26
concerns       0.23
opinion        0.21
willingness    0.20
Concern        0.19
intent         0.18
support        0.18
interest       0.18
desire         0.18
belief         0.17
Activations Density: 0.050%
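
One way to read the density figure is as the share of tokens on which the feature fires at all. The sketch below illustrates that reading with made-up numbers; the function name `activation_density`, the zero threshold, and the sample activations are all assumptions for illustration, not the dashboard's actual computation or data.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active. The dashboard's exact definition is not given in the source.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 5 have a nonzero activation.
acts = [0.0] * 9995 + [1.2, 0.4, 3.1, 0.7, 2.5]

density = activation_density(acts)
print(f"{density:.3%}")  # 0.050%
```

Under this reading, a density of 0.050% corresponds to the feature being active on roughly 1 in 2,000 tokens.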