Explanations
specific scientific terms and concepts related to biology and health
Negative Logits
aarrggbb    -1.01
greateſt    -0.97
leaſt       -0.94
myſelf      -0.89
ſeveral     -0.88
ſmall       -0.87
ſelves      -0.87
ſou         -0.87
ſelf        -0.86
ſte         -0.86
Positive Logits
.           0.57
,           0.53
at          0.48
la          0.43
WHEREAS     0.43
令          0.43
(           0.43
est         0.42
y           0.42
emptyList   0.41
Activations Density: 1.469%