INDEX
Explanations
instances of demonstrations or protests
instances of the word "demonstration" and its variations in the text
Negative Logits
paste     -0.72
ecd       -0.70
saline    -0.67
oho       -0.66
efe       -0.66
pex       -0.65
seed      -0.64
otto      -0.63
onen      -0.62
bid       -0.62
Positive Logits
demonstration     0.83
GOODMAN           0.82
antes             0.78
demonstrations    0.77
demonstrators     0.76
ary               0.75
emonium           0.73
stration          0.73
itzer             0.69
anooga            0.69
Activations Density: 0.031%