INDEX
Explanations
words related to exaggerated attention or promotion
mentions of "hype" and legal terms such as "injunction."
Negative Logits
diplomacy    -0.73
itime        -0.71
Apprentice   -0.65
inking       -0.64
lected       -0.63
aunch        -0.63
floats       -0.62
fully        -0.62
ampunk       -0.61
spheres      -0.61
Positive Logits
fuss         1.48
jee          1.42
hype         1.42
closure      1.28
izu          1.26
stigma       1.22
injunction   1.21
tree         1.17
KR           1.16
icted        1.10
Activations Density: 0.040%