INDEX
Explanations
welcome messages or introductory phrases
phrases that include a welcoming expression or greeting
Negative Logits
Token        Logit
river        -0.75
negie        -0.75
appropri     -0.71
ariat        -0.70
rift         -0.68
nutrition    -0.64
orius        -0.64
arist        -0.64
suggest      -0.63
onel         -0.63
Positive Logits
Token        Logit
giving       0.92
elcome       0.89
Welcome      0.84
prise        0.84
Welcome      0.81
ISTER        0.78
Guest        0.73
Surprise     0.73
prises       0.72
ISSION       0.72
Activations Density: 0.013%