INDEX
Explanations
words indicating strong emphasis or negation, such as 'EVERYTHING', 'VERY', 'NOT', 'NEVER', 'ALWAYS'
emphatic negations and the word "every."
Negative Logits
idem      -0.80
Slovenia  -0.75
ipel      -0.73
atz       -0.73
ijk       -0.72
opol      -0.72
orio      -0.71
Cheong    -0.70
gio       -0.69
idable    -0.68
POSITIVE LOGITS
THING  1.32
MUCH   1.21
NOT    1.17
FUCK   1.16
THERE  1.14
WITH   1.14
ELY    1.13
ONE    1.13
WAY    1.10
THEN   1.10
Activations Density: 0.051%