INDEX
Explanations
words related to defiance or going against established norms or rules
occurrences of the word "def," likely indicating definitions or actions related to defining something
Negative Logits
DAY          -0.72
Madness      -0.71
Boll         -0.70
sth          -0.70
Hour         -0.65
Elves        -0.63
Archdemon    -0.61
Reviewer     -0.61
clip         -0.60
livest       -0.60
Positive Logits
ensible      1.32
erence       1.24
ibr          1.22
acement      1.21
erent        1.19
ected        1.12
ocused       1.12
aced         1.09
ection       1.09
amiliar      1.08
Activations Density: 0.015%