INDEX
Explanations
mentions of the word "brothers"
references to siblings or familial relationships, specifically focusing on brothers
Negative Logits
Token       Logit
EV          -0.72
Citation    -0.72
ļé          -0.69
alling      -0.66
aminer      -0.65
ISON        -0.64
Lew         -0.64
OLOG        -0.63
Desk        -0.62
ļ           -0.61
Positive Logits
Token       Logit
hips        1.07
hip         1.06
brothers    1.02
sisters     0.86
heirs       0.83
tones       0.82
hes         0.82
folk        0.82
fol         0.80
hood        0.80
Activations Density: 0.024%