Explanations
phrases involving specific interactions with characters or specific activities
Negative Logits
ovich     -0.74
ende      -0.72
omsky     -0.72
etary     -0.69
conclud   -0.69
.–        -0.67
pring     -0.67
fair      -0.66
venant    -0.66
bia       -0.65
Positive Logits
friends      1.16
stood        1.16
coworkers    1.07
pals         1.05
colleagues   1.03
strangers    0.98
classmates   0.97
buddies      0.96
fellow       0.96
teammates    0.95
Activations Density 0.190%
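
The density figure is presumably the fraction of sampled token positions on which this feature has a nonzero activation. The page does not state the sample size or the threshold used, so the sketch below is only an illustration: the function name `activation_density`, the zero threshold, and the sample data are all assumptions, chosen so that the result matches the 0.190% shown above.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 19 of which activate the feature.
sample = [0.0] * 9981 + [0.5] * 19
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would mean the feature is sparse: it fires on roughly 2 of every 1,000 tokens.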