Explanations
phrases related to causes or reasons for an event or condition
phrases related to causes of events or conditions
Negative Logits
yss       -0.84
sonian    -0.76
eks       -0.73
nets      -0.72
Laughs    -0.72
Leaks     -0.70
ribes     -0.70
Pixel     -0.69
ellen     -0.69
chens     -0.69
Positive Logits
displacement      0.82
blindness         0.78
discrimination    0.78
illness           0.77
hostilities       0.74
miscar            0.72
separation        0.71
racism            0.69
sexism            0.69
death             0.69
Activations Density: 0.087%