INDEX
Explanations
mentions of the letter 'K' and other textual elements in a structured format
Negative Logits
emale         -0.15
DOI           -0.15
alach         -0.15
edException   -0.14
ighb          -0.14
tes           -0.14
239           -0.14
ynos          -0.14
ends          -0.14
shr           -0.14
Positive Logits
xe            0.26
xf            0.23
xc            0.23
xd            0.23
xa            0.21
xb            0.18
c             0.15
sanct         0.15
5             0.14
imum          0.14
Activations Density 0.000%