INDEX
Explanations
words related to the effectiveness or efficacy of treatments and medical practices
Negative Logits
pper          -0.71
ODE           -0.70
Wonderland    -0.68
Pic           -0.67
dit           -0.65
houn          -0.65
Federation    -0.62
Institution   -0.62
hak           -0.61
Brotherhood   -0.60
Positive Logits
iveness        1.20
iencies        1.06
effectiveness  1.00
fulness        0.96
acies          0.94
acy            0.92
tremend        0.91
abilities      0.90
ality          0.85
ulence         0.85
Activations Density: 0.011%
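
The page does not define the density figure. A minimal sketch, assuming it means the percentage of sampled tokens on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and chosen only to reproduce a value of the same size:

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: the feature fires on 11 of 100,000 tokens.
sample = [0.0] * 99_989 + [1.5] * 11
print(f"{activation_density(sample):.3f}%")  # prints 0.011%
```

Under this reading, a density of 0.011% would mean the feature fires on roughly 1 token in 9,000, which fits a feature tied to a narrow set of suffixes.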