INDEX
Explanations
phrases related to offering help or support
expressions of welcoming and inclusivity
Negative Logits
asus       -0.51
Awareness  -0.45
cano       -0.45
Centers    -0.45
gemony     -0.43
uesday     -0.43
eatured    -0.43
cour       -0.41
liament    -0.41
efe        -0.41
Positive Logits
to         1.19
.          1.07
!.         1.03
;)         1.01
thereto    1.00
:)         0.95
!          0.91
TO         0.91
.'         0.89
.:         0.89
Activations Density: 0.474%