INDEX
Explanations
terms related to different forms of abuse, such as physical, emotional, and financial abuse
instances of the word "abuse" and its varying contexts
Negative Logits
cil        -0.85
travel     -0.78
izen       -0.73
pard       -0.73
ailed      -0.68
pai        -0.67
views      -0.66
vision     -0.66
cise       -0.65
printed    -0.64
Positive Logits
abuse          0.92
abusing        0.84
abuse          0.80
perpetrated    0.80
inflicted      0.77
abused         0.74
victims        0.74
tactics        0.73
abuses         0.72
abusers        0.72
Activations Density: 0.024%