INDEX
Explanations
locations in Germany, specifically the cities of Hamburg and Cologne
references to specific cities, particularly Hamburg and Cologne
Negative Logits
yip         -0.94
inition     -0.87
wright      -0.84
itton       -0.80
avery       -0.80
osition     -0.76
iries       -0.75
constitu    -0.73
ippery      -0.71
ocally      -0.71
Positive Logits
Hamburg     1.13
lar         0.81
urger       0.80
£ı          0.79
Munich      0.77
Dresden     0.73
Bay         0.72
stem        0.72
Marathon    0.71
Hau         0.71
Activations Density 0.013%
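
The density figure above can be read as the share of sampled tokens on which the feature fires. The page does not say how it was computed, so the following is only a minimal sketch under assumptions: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of sampled tokens on which
    the feature is active. The dashboard's actual definition may differ.
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    firing = sum(1 for a in values if a > threshold)
    return 100.0 * firing / len(values)


# Hypothetical sample: the feature fires on 13 of 100,000 tokens.
sample = [0.0] * 99_987 + [1.5] * 13
density = activation_density(sample)
print(f"{density:.3f}%")  # prints 0.013%
```

A density this low would mean the feature is sparse, which fits the narrow city-name explanations listed above.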