INDEX
Explanations
references to companionship or social connections
"with" followed by a family/friend related word
social relationships with people
Negative Logits
Token         Logit
houſe         -0.56
ſelf          -0.52
ſever         -0.50
themſelves    -0.50
orini         -0.48
vidia         -0.48
Créditos      -0.48
ſch           -0.46
pleaſure      -0.46
leaſt         -0.46
Positive Logits
Token         Logit
friends       1.25
buddies       1.06
pals          1.04
fellow        1.04
friends       1.04
friend        1.02
colleagues    0.97
companions    0.95
φί            0.93
friend        0.90
Activations Density: 0.171%