INDEX
Explanations
phrases describing a specific time or period
phrases that describe a specific temporal context
Negative Logits
rolet       -0.75
ply         -0.74
agin        -0.68
ha          -0.67
andal       -0.67
urally      -0.66
gan         -0.64
byte        -0.60
NOTE        -0.60
bas         -0.60

Positive Logits
soever      1.10
abouts      0.97
irlf        0.74
upon        0.71
ornia       0.71
faced       0.69
earch       0.68
inelli      0.67
they        0.67
discussing  0.67
Activations Density: 0.060%