INDEX
Explanations
comparisons to other entities
references to comparisons between different entities or categories
Negative Logits
Token       Logit
steen       -0.70
ãĥĻ         -0.63
aily        -0.59
itone       -0.58
angan       -0.57
attribute   -0.57
=#          -0.56
hower       -0.56
rang        -0.55
instein     -0.55
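The token `ãĥĻ` in the negative logits list looks like a byte-level BPE vocabulary entry rather than readable text. The page does not name the tokenizer, so the following is only a sketch under the assumption that the model uses the GPT-2-style byte-to-unicode mapping; the function names `bytes_to_unicode` and `decode_token` are illustrative and not taken from this page.

```python
def bytes_to_unicode():
    """Build the GPT-2-style byte -> printable-character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


def decode_token(token):
    """Map a vocabulary string back to raw bytes, then decode as UTF-8."""
    char_to_byte = {c: b for b, c in bytes_to_unicode().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # A token can hold a partial UTF-8 sequence, so replace undecodable bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ãĥĻ"))    # three bytes E3 83 99, the katakana character U+30D9
    print(decode_token("steen"))  # plain ASCII tokens come back unchanged
```

If the assumption holds, the entry is a single Japanese character stored as its three UTF-8 bytes; under a different tokenizer the mapping above would not apply.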
Positive Logits
Token       Logit
surveyed    0.81
except      0.80
besides     0.79
vying       0.73
nationwide  0.73
whatsoever  0.71
.'          0.71
due         0.67
'.          0.67
combined    0.67
Activations Density 0.235%
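
The page does not say how the density figure is computed. A common reading is the fraction of sampled tokens on which the feature fires at all; the sketch below assumes that definition. The function name `activation_density`, the `threshold` parameter, and the sample data are all made up for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


if __name__ == "__main__":
    # Synthetic data: 47 active tokens out of 20,000, chosen only to
    # illustrate the scale of a density such as 0.235%.
    acts = [1.5] * 47 + [0.0] * (20_000 - 47)
    print(f"{activation_density(acts):.3%}")
```

Under this reading, a density of 0.235% means the feature is active on roughly 2 to 3 tokens per thousand.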