INDEX
Explanations
differences in opinions or beliefs between different groups
references to varying perspectives or opinions held by different groups of people
Negative Logits
td               -0.66
\/               -0.62
Donation         -0.61
Guru             -0.59
Conc             -0.59
before           -0.58
Complex          -0.58
advertisement    -0.57
BEFORE           -0.57
-+               -0.56
Positive Logits
succumb          0.80
enes             0.78
rail             0.75
prefer           0.68
resent           0.65
idav             0.64
simply           0.64
rast             0.63
ports            0.63
succumbed        0.62
Activations Density: 0.154%