Explanations
mentions of the word "ink" plus a number after
instances of the word "ink" in various contexts
Negative Logits
ãĥ£          -0.72
compensate   -0.72
ccording     -0.71
contend      -0.66
trave        -0.66
captcha      -0.65
blance       -0.63
foundation   -0.63
basis        -0.63
ppa          -0.62
Positive Logits
ink          1.01
erman        0.99
edin         0.99
erm          0.91
overs        0.91
ers          0.89
oop          0.86
ners         0.85
ering        0.84
glers        0.82
Activations Density 0.007%