Explanations
phrases related to care, particularly in a caregiving or health context
Negative Logits
myſelf           -0.88
auffi            -0.88
themſelves       -0.88
Monfieur         -0.87
ConstraintMaker  -0.81
ſeveral          -0.80
himſelf          -0.79
pleaſure         -0.79
ſhe              -0.79
Brescia          -0.79
Positive Logits
call  0.63
app   0.58
Made  0.58
I     0.53
Glad  0.51
care  0.49
hat   0.49
Glad  0.46
I     0.46
care  0.45
Activations Density: 0.227%