INDEX
Explanations
phrases related to the action of spreading information or knowledge
phrases related to dissemination or communication
Negative Logits
deen          -0.77
*/(           -0.75
Zup           -0.73
clamation     -0.68
herty         -0.67
--+           -0.66
Starship      -0.65
quo           -0.65
tarians       -0.63
Halls         -0.61
Positive Logits
sheets            1.85
sheet             1.41
shirt             0.93
misinformation    0.91
pread             0.83
spreads           0.82
awareness         0.81
spread            0.80
disinformation    0.80
across            0.80
Activations Density 0.046%