INDEX
Explanations
references to different areas, such as cities, states, countries, and the world in general, as well as certain specific names and titles
references to prominent journalists and their affiliations
Negative Logits
bruising        -0.66
sim             -0.57
cooldown        -0.56
fres            -0.55
nighttime       -0.54
results         -0.54
advertisement   -0.54
wrinkles        -0.53
flation         -0.53
holiday         -0.52
Positive Logits
deserve         1.49
agree           1.23
intend          1.21
realize         1.18
want            1.18
know            1.17
recognize       1.17
understand      1.16
despise         1.16
owe             1.14
Activations Density 0.526%