Explanations
words related to types or characteristics of bacteria
Negative Logits
Token   Logit
er      -0.35
i       -0.32
le      -0.27
m       -0.21
Ùī      -0.21
yor     -0.20
iou     -0.19
ar      -0.19
y       -0.19
l       -0.18
Positive Logits
Token      Logit
osoph      0.21
ox         0.21
antro      0.21
adelphia   0.20
houette    0.20
teenth     0.19
osopher    0.18
waukee     0.18
ypad       0.18
lico       0.18
Activations Density: 0.053%