INDEX
Explanations
the word 'colleague'
mentions of colleagues or coworkers
references to colleagues or teammates
Negative Logits
itters     -0.93
avers      -0.86
ingers     -0.85
etting     -0.84
liga       -0.83
uits       -0.80
cale       -0.80
olia       -0.79
ourcing    -0.78
aida       -0.78
Positive Logits
colleague  0.95
classmate  0.84
Ally       0.83
nered      0.81
comrade    0.78
laureate   0.78
lier       0.74
Laure      0.73
Prosper    0.71
cowork     0.71
Activations Density: 0.032%