Explanations
words related to physical illness or discomfort
occurrences of the word "sick" in various contexts
Negative Logits
Token        Logit
unlaw        -0.69
sanctioned   -0.67
Unch         -0.66
principals   -0.65
compr        -0.64
Goodwin      -0.62
wcsstore     -0.61
guid         -0.61
CLS          -0.60
rul          -0.59
Positive Logits
Token        Logit
ening        1.47
ened         1.31
bay          1.27
er           0.98
nesses       0.96
estro        0.93
ert          0.92
erton        0.92
ly           0.90
ness         0.90
Activations Density: 0.024%