Explanations
words related to negative actions or characteristics, such as hate, murder, and suspicion
instances of the word "sn" or variations of it, likely indicating a focus on snarky or sarcastic commentary
Negative Logits
heid          -0.87
EMENT         -0.76
xual          -0.72
WAYS          -0.64
mine          -0.63
PowerPoint    -0.62
minus         -0.61
shire         -0.60
geist         -0.60
Sacrament     -0.59
Positive Logits
atching       1.19
obb           1.15
atches        1.13
agged         1.12
agging        1.11
appers        1.09
appy          1.08
apper         1.08
apping        1.07
ugg           1.06
Activations Density: 0.009%