INDEX
Explanations
references to a specific sports team named "Kings."
references to the "Kings" within various contexts
NEGATIVE LOGITS
resil       -0.81
orno        -0.77
filament    -0.71
afe         -0.70
balloon     -0.69
malf        -0.66
abe         -0.66
iste        -0.65
omething    -0.65
AAAA        -0.65
POSITIVE LOGITS
Kings       4.04
Kings       3.13
kings       1.92
King        1.56
King        1.49
Rams        1.38
Devils      1.31
Knights     1.30
Kingdoms    1.27
Queens      1.24
Activations Density: 0.010%