Explanations
medical conditions or health-related terms
mentions of medical or physical conditions
Negative Logits
Token          Logit
hots           -0.86
apers          -0.74
angel          -0.69
hers           -0.68
usercontent    -0.67
igr            -0.67
ucks           -0.65
ango           -0.63
rations        -0.63
hey            -0.63
Positive Logits
Token          Logit
condition      4.06
Condition      2.79
condition      2.76
Condition      2.32
conditions     2.25
Conditions     1.99
conditioning   1.64
conditioned    1.50
circumstance   1.28
situation      1.28
Activations Density 0.019%
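
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below shows that arithmetic under an assumption not stated in the source: that "active" means an activation strictly above zero. The function name `activation_density` and the sample data are hypothetical and chosen only to reproduce a value of 0.019%.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: a token counts as active when its activation is strictly
    greater than the threshold (zero by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 19 active tokens out of 100,000.
sample = [1.5] * 19 + [0.0] * 99_981
density = activation_density(sample)
print(f"{density:.3%}")
```

At this density the feature fires on roughly 19 of every 100,000 tokens, which fits a feature that responds to one narrow family of words.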