INDEX
Explanations
words related to social or political issues and actions, especially those with negative connotations
terms associated with anti-arguments or opposition to various issues
Negative Logits
Token        Logit
Rue          -0.70
dots         -0.66
Mock         -0.65
KNOWN        -0.65
staking      -0.62
STATS        -0.62
McH          -0.61
notebooks    -0.60
mit          -0.59
Nun          -0.59
Positive Logits
Token        Logit
usterity     0.99
roleum       0.86
otic         0.86
otics        0.84
byter        0.80
aphael       0.79
ilib         0.78
osher        0.78
amacare      0.76
ucl          0.76
Activations Density 0.132%
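
The density figure above can be read as the share of token positions on which the feature fires. The following is a minimal sketch of that calculation, assuming density means "fraction of positions with activation above a threshold"; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of positions where the feature is active;
    the dashboard's exact definition is not stated in the source.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 132 of them with a nonzero activation.
sample = [0.0] * 99_868 + [1.5] * 132
density = activation_density(sample)
print(f"Activation density: {density:.3%}")
```

With a sample like this, a feature that fires on 132 of 100,000 positions has the same density as the one reported here, which illustrates how sparse the feature is.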