INDEX
Explanations
terms related to physiological conditions or processes
Negative Logits
ible      -0.81
ipeg      -0.80
izen      -0.80
itarian   -0.74
ifiable   -0.73
ifier     -0.70
Leilan    -0.69
ariat     -0.66
Guilty    -0.66
ices      -0.64
Positive Logits
lla       0.98
lli       0.91
lly       0.88
tes       0.88
nces      0.84
nce       0.83
tic       0.83
xual      0.82
ll        0.82
ttes      0.82
Activations Density: 0.011%