INDEX
Explanations
terms related to opinions and organizational entities like unions
variations of the word "opinion" and related terms
Negative Logits
weights             -0.70
floor               -0.66
fabrication         -0.64
ten                 -0.63
abases              -0.61
\\\\\\\\\\\\\\\\    -0.59
cases               -0.59
ropolis             -0.58
WT                  -0.57
Tanz                -0.56
Positive Logits
hip                 1.00
piracy              0.99
pread               0.98
hips                0.98
ervatives           0.97
naire               0.94
hov                 0.94
ervative            0.93
ystem               0.92
chool               0.86
Activation Density: 0.050%