Explanations
terms related to societal issues and conflicts
phrases with high-frequency conjunctions and the word "and."
Negative Logits
Token     Logit
IRE       -0.72
etting    -0.64
united    -0.58
arov      -0.56
fold      -0.54
GO        -0.54
OGR       -0.54
rison     -0.54
HEAD      -0.53
REE       -0.53
Positive Logits
Token     Logit
the       1.03
the       0.93
});       0.67
Cth       0.67
its       0.67
THE       0.66
those     0.65
ata       0.65
whoever   0.62
The       0.62
Activations Density: 0.521%