INDEX
Explanations
concepts related to medical treatments and their effectiveness
Negative Logits
indi      -0.07
iyim      -0.07
Dün       -0.07
ерш       -0.07
undred    -0.07
ernen     -0.06
plně      -0.06
ť         -0.06
дво       -0.06
巨        -0.06
Positive Logits
most       0.26
majority   0.24
most       0.18
대부분     0.17
Most       0.17
sometimes  0.17
Majority   0.17
большин    0.16
MOST       0.16
often      0.16
Activations Density 0.293%