INDEX
Explanations
words related to care or caregiving responsibilities
Head Attr Weights
0:0.07
1:0.06
2:0.08
3:0.08
4:0.08
5:0.09
6:0.08
7:0.07
8:0.08
9:0.09
10:0.09
11:0.09
Negative Logits
[-              -1.64
jet             -1.54
confiscated     -1.39
Droid           -1.29
Construction    -1.29
noticed         -1.29
flow            -1.28
hots            -1.28
lift            -1.27
processor       -1.22
Positive Logits
olkien          1.81
ADRA            1.75
ilial           1.61
iasis           1.57
oult            1.50
UTH             1.50
ourke           1.45
selves          1.45
entimes         1.45
��              1.39
Activations Density 0.000%