Explanations
repetitive structures or patterns in phrases
Negative Logits
iesel      -0.21
amak       -0.15
unj        -0.15
isoft      -0.14
odash      -0.14
út         -0.14
_exempt    -0.14
ught       -0.14
arpa       -0.14
ะ          -0.14
Positive Logits
ifications  0.26
ifi         0.24
ifiable     0.23
ifying      0.22
ifies       0.19
ification   0.19
ified       0.18
IFI         0.17
sand        0.16
s           0.16
Activations Density 0.113%