INDEX
Explanations
phrases related to time and duration
Negative Logits
old      -0.16
stag     -0.16
nd       -0.16
ogle     -0.15
ing      -0.15
/is      -0.15
ional    -0.15
orry     -0.14
stile    -0.14
ngr      -0.14
Positive Logits
-HT      0.28
teenth   0.27
teen     0.25
де       0.24
th       0.21
-star    0.20
Thirty   0.19
bread    0.19
tte      0.18
ylim     0.18
Activations Density 0.178%