INDEX
Explanations
mentions of a particular name or entity
instances of the word "Cap" or related variations, suggesting a focus on capitalization or specific entities represented by "Cap"
Negative Logits
hower      -0.95
ĪĴ         -0.86
¿½         -0.84
silence    -0.79
cause      -0.74
perse      -0.70
hood       -0.69
MSM        -0.64
anten      -0.64
td         -0.63
Positive Logits
itol         1.43
itals        1.23
acity        1.19
illary       1.19
rice         1.17
uchin        1.09
rices        1.06
abilities    1.06
itan         1.00
ulet         0.99
Activations Density: 0.017%