INDEX
Explanations
phrases indicating clinical outcomes and the effectiveness of medical treatments
Negative Logits
burg          -0.55
//            -0.52
ma            -0.47
lest          -0.45
vov           -0.44
raud          -0.44
rax           -0.43
BURG          -0.43
っけ          -0.42
tranquillo    -0.41
Positive Logits
improvements   0.99
effects        0.94
improvement    0.93
measurable     0.93
effects        0.91
Effects        0.88
Improvements   0.87
Effects        0.86
effect         0.85
effect         0.84
Activations Density: 0.631%