INDEX
Explanations
occurrences of specific components or attributes relating to items or subjects being listed
Negative Logits
Token     Logit
lico      -0.18
pNet      -0.17
zap       -0.16
riz       -0.16
igi       -0.16
PRINTF    -0.15
ropoda    -0.15
ollo      -0.15
antino    -0.14
lero      -0.14
Positive Logits
Token     Logit
Ir        0.16
rought    0.16
Ir        0.15
ands      0.15
jr        0.15
Ñģо       0.15
rov       0.14
dra       0.14
ku        0.14
_SY       0.14
Activations Density 0.004%
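
The density figure above can be read as the share of tokens on which the feature fires. The sketch below assumes that definition (activation strictly above zero counts as "active"), which this page does not state; the function name `activation_density`, the threshold, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means active tokens divided by total tokens.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 4 tokens out of 100,000.
sample = [0.0] * 100_000
for i in (10, 2_000, 50_000, 99_999):
    sample[i] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")  # 0.004%
```

Under this reading, a density of 0.004% corresponds to roughly 4 active tokens per 100,000, so the feature is very sparse.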