Explanations
phrases related to spreading information or messages to others
phrases related to disseminating information
Negative Logits
antry   -0.82
etary   -0.63
onding  -0.63
ļéĨĴ    -0.61
ski     -0.58
orts    -0.58
ishes   -0.57
cients  -0.57
efe     -0.56
ORTS    -0.56
Positive Logits
fitted     1.07
wards      0.93
sole       0.92
stretched  0.91
loud       0.89
smart      0.88
skirts     0.83
lived      0.83
bur        0.81
Loud       0.79
Activations Density: 0.085%