Explanations
references to the city of Berlin
Negative Logits
ciating        -0.86
merce          -0.79
cript          -0.78
BOOK           -0.77
DonaldTrump    -0.76
tml            -0.75
apon           -0.74
orses          -0.73
efeated        -0.72
IRED           -0.72
Positive Logits
er             0.89
Chancellor     0.88
Berlin         0.85
Munich         0.82
Wall           0.81
Airl           0.81
Blitz          0.79
wings          0.79
furt           0.77
Pact           0.77
Activations Density: 0.008%