INDEX
Explanations
expressions related to time, particularly phrases that mention 'day to day' activities
phrases related to daily routines and temporal references
Negative Logits
Squ            -0.81
Alternatively  -0.65
went           -0.62
allowed        -0.62
Zip            -0.60
antha          -0.60
Tan            -0.60
OUNT           -0.60
BALL           -0.60
Yellow         -0.59
Positive Logits
metab          0.68
antics         0.62
insults        0.62
extent         0.61
umatic         0.60
motions        0.60
outburst       0.59
dismiss        0.59
heimer         0.59
venge          0.58
Activations Density: 0.091%