INDEX
Explanations
phrases concerning temperature and weather conditions
Negative Logits
sunshine    -0.18
sunny       -0.17
plode       -0.16
Sunshine    -0.15
lax         -0.15
Hot         -0.15
Mills       -0.14
hot         -0.14
otch        -0.14
sunlight    -0.14
Positive Logits
cold        0.45
colder      0.41
cold        0.40
Cold        0.40
chilly      0.39
Cold        0.38
冷          0.34
холод       0.32
winter      0.30
寒          0.30
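Token lists exported from byte-level BPE vocabularies often show non-ASCII tokens in a byte-mapped display form, for example `åĨ·` for 冷. The sketch below shows one way to map such strings back to readable text. It assumes a GPT-2 style byte-to-character table, which this page does not confirm; the function names `byte_level_table` and `decode_display_token` are hypothetical helpers, not part of any library.

```python
def byte_level_table():
    """Build a GPT-2 style map from byte value to display character (assumed scheme)."""
    # Bytes that are displayed as themselves: printable ASCII and most of Latin-1.
    keep = set(range(ord("!"), ord("~") + 1))
    keep |= set(range(0xA1, 0xAC + 1))
    keep |= set(range(0xAE, 0xFF + 1))
    table = {}
    shift = 0
    for b in range(256):
        if b in keep:
            table[b] = chr(b)
        else:
            # Remaining bytes are shifted to code points starting at U+0100.
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse map: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in byte_level_table().items()}


def decode_display_token(token):
    """Convert a byte-mapped display string back to UTF-8 text."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Characters outside the table are treated as already decoded.
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for shown in ["\u00e5\u0128\u00b7", "\u00e5\u00af\u0134", "cold"]:
        print(repr(shown), "->", decode_display_token(shown))
```

Plain ASCII tokens such as `cold` pass through unchanged, so the helper can be applied to a whole logit list without special-casing.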
Activations Density 0.125%