INDEX
Explanations
concepts related to scientific classification and taxonomy
Negative Logits
6     -0.23
3     -0.23
2     -0.22
5     -0.22
1     -0.22
.     -0.22
8     -0.22
,     -0.21
10    -0.21
11    -0.20
Positive Logits
lerle       0.23
heits       0.23
ungs        0.23
ounter      0.21
itä         0.21
ensch       0.20
erver       0.20
istent      0.20
enderror    0.20
üstü        0.20
Activations Density 0.049%