INDEX
Explanations
names of individuals, particularly those associated with scientific or medical contexts
Negative Logits
xba    -0.16
xbb    -0.16
xDD    -0.15
xcd    -0.15
xbc    -0.14
xeb    -0.14
dde    -0.14
ë²Į    -0.14
åĽ     -0.14
wdx    -0.14
Positive Logits
AF    0.72
MF    0.66
DF    0.65
HF    0.65
AF    0.65
LF    0.64
Af    0.63
af    0.63
BF    0.63
NF    0.63
Activations Density 0.621%