INDEX
Explanations
occurrences of the word "select" and various forms of the word "viving" (such as "living")
Negative Logits
Tort      -0.18
rai       -0.16
lun       -0.15
:"-"`↵    -0.14
epend     -0.14
alloc     -0.14
Torres    -0.14
acas      -0.14
Kurul     -0.14
armor     -0.14
Positive Logits
hw        0.16
etros     0.15
Bret      0.14
roit      0.14
retro     0.14
roat      0.14
ẽ         0.13
etrofit   0.13
oland     0.13
etro      0.13
Activations Density: 0.007%