INDEX
Explanations
phrases mentioning specific groups or organizations
phrases that involve membership in a group or organization
Negative Logits
nipples        -0.74
consistency    -0.73
nost           -0.67
ONSORED        -0.65
centerpiece    -0.65
distractions   -0.64
concess        -0.64
proportions    -0.63
references     -0.62
fumes          -0.62
Positive Logits
Parliament     0.83
parliament     0.82
Congress       0.81
odox           0.80
congress       0.76
errilla        0.74
amily          0.71
House          0.69
society        0.68
ãĥĦ            0.67
Activations Density: 0.087%