Explanations
phrases and terms related to expressing opinions or beliefs
references to differing opinions or perspectives
Negative Logits
Token       Logit
wealth      -0.73
cial        -0.73
trap        -0.70
ursed       -0.67
ufact       -0.66
cially      -0.65
breaking    -0.65
period      -0.64
cold        -0.64
dry         -0.64
Positive Logits
Token          Logit
views          1.14
guiActiveUn    1.01
viewpoints     0.92
opinions       0.85
ports          0.84
terday         0.83
viewpoint      0.79
opin           0.76
Views          0.73
yip            0.73
Activations Density: 0.009%