INDEX
Explanations
terms related to medical conditions and treatments
Negative Logits
Token        Logit
oubliez      -0.46
úrese        -0.45
featureID    -0.43
both         -0.43
Ak           -0.41
             -0.41
xtures       -0.40
God          -0.40
not          -0.40
top          -0.39
Positive Logits
Token        Logit
itſelf       0.93
myſelf       0.83
مرئيه        0.81
themſelves   0.81
ordinaria    0.77
himſelf      0.76
elsewhere    0.74
againſt      0.72
Reſ          0.72
ſelf         0.72
Activations Density: 1.486%