INDEX
Explanations
phrases related to social commentary and rhetoric
Negative Logits
Token         Logit
revis         -0.21
salv          -0.18
tlement       -0.17
igation       -0.16
ÑħÑĸд         -0.16
ideographic   -0.16
elic          -0.15
ÙĪÛĮÛĮ        -0.15
enders        -0.15
ules          -0.15
Positive Logits
Token         Logit
ize           0.29
ify           0.28
inize         0.24
itize         0.23
ise           0.23
inate         0.22
elize         0.22
isify         0.21
minate        0.21
iliate        0.21
Activations Density: 0.239%