INDEX
Explanations
phrases and words related to the impact of actions on a collective or societal level
Negative Logits
minster    -0.73
thal       -0.66
ishly      -0.65
ãĥĥãĥī     -0.64
corn       -0.63
iously     -0.62
eps        -0.60
ragon      -0.60
phony      -0.60
glass      -0.59
Positive Logits
however    1.01
though     0.82
insofar    0.77
we         0.76
let        0.76
there      0.75
moreover   0.74
they       0.72
lest       0.71
remember   0.69
Activations Density 0.053%