INDEX
Explanations
dates and years related to events or decisions
Negative Logits
Token      Logit
ave        -0.15
spring     -0.15
671        -0.14
avern      -0.14
portion    -0.14
tractor    -0.14
Pole       -0.14
raction    -0.14
ici        -0.13
eeper      -0.13
Positive Logits
Token      Logit
LATED      0.17
cite       0.16
celik      0.15
mie        0.14
194        0.14
ledo       0.14
fos        0.14
bench      0.14
198        0.14
_pdf       0.14
Activations Density: 0.067%