INDEX
Explanations
references to changes over time and advancements in technology
Negative Logits
olumn       -0.17
รด          -0.16
anki        -0.15
herit       -0.14
irsch       -0.14
histor      -0.14
omba        -0.14
edia        -0.14
heritage    -0.13
ực          -0.13
Positive Logits
intervening  0.25
changed      0.24
change       0.22
changed      0.21
changes      0.20
change       0.19
Changed      0.19
变化         0.19
chang        0.18
-change      0.18
Activations Density 0.110%