INDEX
Explanations
references to the "Game of Thrones" franchise and associated elements
Negative Logits
ští     -0.17
dle     -0.16
idge    -0.16
ILLE    -0.15
.si     -0.15
ecz     -0.15
кур     -0.15
HING    -0.15
erty    -0.14
ids     -0.14
Positive Logits
rones    0.27
th       0.26
Thrones  0.23
drones   0.20
throne   0.18
rone     0.17
drone    0.17
chairs   0.17
Chairs   0.16
crow     0.16
Activations Density 0.008%