INDEX
Explanations
phrases indicating announcements or declarations of pleasure or happiness
Negative Logits
comprehension     -0.72
ById              -0.71
anooga            -0.68
aversion          -0.66
soDeliveryDate    -0.65
puted             -0.61
VERTISEMENT       -0.59
Wise              -0.58
rarily            -0.56
subur             -0.55
Positive Logits
announce          1.39
introduce         1.08
hear              1.07
endorse           1.01
welcome           1.00
invite            0.99
see               0.98
unveil            0.97
join              0.96
participate       0.95
Activations Density: 0.073%