Explanations
descriptions related to sensory characteristics or physical appearance
terms related to disabilities and sensitivity to others' conditions
Negative Logits
ameron        -0.66
Additional    -0.62
lag           -0.61
Wo            -0.61
Ball          -0.60
Sierra        -0.60
Western       -0.59
ggles         -0.58
initions      -0.58
UTION         -0.58
Positive Logits
enough        0.96
minded        0.77
retty         0.75
ーク          0.67
enough        0.66
asleep        0.65
Ichigo        0.65
urated        0.65
wired         0.65
alright       0.64
Activations Density 0.497%