INDEX
Explanations
words associated with temperature or popularity, specifically the term "hot."
Negative Logits
Token             Logit
.scalablytyped    -0.18
sut               -0.16
jee               -0.15
ously             -0.15
arian             -0.15
ately             -0.15
磨                -0.14
hood              -0.14
atchet            -0.14
Ùħ                -0.14
Positive Logits
Token             Logit
-hot              0.23
spots             0.18
elper             0.18
-blood            0.17
endar             0.16
hotter            0.15
imir              0.15
ly                0.15
/fire             0.15
äºİ               0.14
Activations Density: 0.019%