Explanations
greeting-related words and phrases
phrases related to greetings and welcoming interactions
Negative Logits
negie       -0.78
cano        -0.70
proxy       -0.66
moral       -0.65
ulz         -0.64
Gray        -0.64
criminal    -0.64
anyon       -0.63
inhibitor   -0.62
gur         -0.61
Positive Logits
greet       1.07
greeting    1.02
enance      0.89
greets      0.86
issance     0.85
eering      0.83
politely    0.81
greeted     0.81
goodbye     0.80
reetings    0.78
Activations Density 0.038%