INDEX
Explanations
instances of the word "can" indicating capability or potential
Negative Logits
verts     -0.17
isoft     -0.15
éro       -0.15
ukt       -0.15
èij       -0.15
ILA       -0.14
itself    -0.14
uye       -0.14
ailing    -0.14
erro      -0.14
Positive Logits
imagine       0.22
picture       0.22
v             0.20
understand    0.20
truth         0.19
imaging       0.19
fran          0.19
image         0.18
tell          0.18
GU            0.18
Activations Density 0.051%