Explanations
affirmations of well-being and concern for others' health
Negative Logits
oons        -0.17
Kah         -0.15
olders      -0.15
836         -0.15
redo        -0.14
.cons       -0.14
Pleasant    -0.14
ohn         -0.14
ondo        -0.14
é¾Ħ         -0.14
Positive Logits
fine        0.39
OK          0.36
ok          0.35
okay        0.34
Fine        0.31
OK          0.31
fine        0.31
okay        0.30
_OK         0.29
Ok          0.29
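Logit tables like the ones above are commonly produced by projecting a feature's decoder direction through the model's unembedding matrix and ranking tokens by the resulting score. The page does not say how its values were computed, so the following is only a minimal sketch under that assumption. The function name `top_logit_effects`, the row-per-token layout of `unembed`, and all numbers are hypothetical toy values, not data from this feature.

```python
def top_logit_effects(decoder_dir, unembed, vocab, k=2):
    """Return (most positive, most negative) tokens for one feature.

    Assumption: unembed[i] is the vector for vocab[i], and a token's
    logit effect is the dot product of that vector with decoder_dir.
    """
    effects = [
        sum(d * w for d, w in zip(decoder_dir, row))
        for row in unembed
    ]
    ranked = sorted(zip(vocab, effects), key=lambda pair: pair[1], reverse=True)
    positive = ranked[:k]            # largest effects first
    negative = ranked[-k:][::-1]     # most negative effects first
    return positive, negative


# Toy two-dimensional example (values are illustrative only).
vocab = ["fine", "OK", "redo", "Kah"]
unembed = [[2.0, 0.5], [1.0, -0.5], [-1.0, 0.25], [-2.0, 0.0]]
decoder_dir = [1.0, 0.0]

positive, negative = top_logit_effects(decoder_dir, unembed, vocab)
print(positive)
print(negative)
```

In this sketch the positive list holds the tokens the feature pushes the model toward, and the negative list holds the tokens it pushes away from.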
Activations Density 0.203%
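An activation density figure of this kind is usually read as the fraction of sampled tokens on which the feature fires at all. The page does not define the statistic, so the sketch below assumes that reading. The name `activation_density`, the zero threshold, and the sample values are hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds the threshold.

    Assumption: a feature "fires" on a token when its activation is
    strictly greater than `threshold` (zero by default).
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Toy sample: the feature fires on 2 of 1000 tokens.
sample = [0.0] * 998 + [1.5, 0.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

A low density, as reported here, means the feature is sparse and is active on only a small share of tokens.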