INDEX
Explanations
topics related to health and therapy, particularly involving unconventional treatments and their societal implications
Negative Logits
’).      -1.02
"].      -0.97
”).      -0.95
']").    -0.92
).       -0.90
))$.     -0.89
》.      -0.88
}}$.     -0.88
?).      -0.88
]."      -0.86
Positive Logits
,           1.66
;           1.27
but         1.00
because     0.94
whereas     0.84
%,          0.83
$,          0.82
although    0.79
,           0.79
והוא        0.78
Activations Density 3.912%