INDEX
Explanations
references to heat and temperature-related phenomena
Negative Logits
ing        -0.21
hood       -0.20
ìķ¼        -0.16
ately      -0.16
eniable    -0.15
openh      -0.14
sut        -0.14
eso        -0.14
som        -0.14
erator     -0.14
Positive Logits
stroke     0.19
/fire      0.18
rice       0.16
ronic      0.16
argon      0.16
ÑĦик       0.15
stile      0.15
illac      0.15
ilda       0.15
teg        0.15
Activations Density: 0.026%