INDEX
Explanations
phrases related to the passage of time or historical events
the phrase "ever since" followed by a time reference
Negative Logits
================================  -0.71
Es                                -0.70
Auto                              -0.70
OH                                -0.68
emo                               -0.66
Console                           -0.63
Fight                             -0.62
Bonus                             -0.60
âĹ¼                               -0.60
abus                              -0.59
Positive Logits
rely         1.07
theless      0.84
ĸļ           0.77
Seym         0.70
NESS         0.69
inception    0.66
Senegal      0.66
Keller       0.65
adolescence  0.63
rero         0.62
Activations Density 0.021%