INDEX
Explanations
references to a specific plant
Negative Logits
Token    Logit
erli     -0.17
aired    -0.16
eru      -0.16
rq       -0.15
fdb      -0.15
icts     -0.15
mada     -0.15
apan     -0.15
erah     -0.15
ras      -0.15
Positive Logits
Token    Logit
bons     0.22
bles     0.21
onacci   0.20
bage     0.20
rahim    0.19
bett     0.19
ber      0.18
ele      0.18
bole     0.18
bler     0.17
Activations Density: 0.021%