INDEX
Explanations
proper nouns related to competitions
the presence of specific names or terms related to characters or entities in a context
Negative Logits
hered         -0.78
growth        -0.71
hemat         -0.71
ritic         -0.68
Downloadha    -0.65
PROV          -0.64
shr           -0.63
apple         -0.62
ãģķ           -0.60
Scotland      -0.58
Positive Logits
uel           1.31
theless       1.07
oaded         0.85
onge          0.83
ength         0.82
destro        0.80
inations      0.79
inus          0.78
iel           0.77
ocity         0.77
Activations Density: 0.004%