INDEX
Explanations
terms related to medical diagnosis or political outcomes, especially with a focus on negative consequences
terms related to diagnosis and losing situations
Negative Logits
mete      -0.67
link      -0.65
Discord   -0.62
commun    -0.61
Nun       -0.57
gig       -0.57
meteor    -0.57
charm     -0.56
Fam       -0.56
女        -0.56
Positive Logits
osing     4.13
oses      2.77
osed      2.62
oser      2.05
ose       2.03
osures    1.80
OSE       1.77
OSED      1.48
osite     1.47
osis      1.43
Activations Density 0.015%