INDEX
Explanations
mentions of welcoming and introductions in a text
instances of the phrase "Welcome to."
Negative Logits
emitted    -0.71
loo        -0.69
scales     -0.67
okia       -0.67
ebted      -0.66
exerted    -0.65
liga       -0.64
etooth     -0.64
hai        -0.64
relied     -0.63
Positive Logits
asty       0.72
Paradise   0.68
Ô          0.68
Bast       0.67
Eve        0.65
Citizens   0.65
Fairy      0.64
yles       0.64
clus       0.64
HIP        0.64
Activations Density: 0.063%