INDEX
Explanations
religious or moral language related to purity and sanctity
adjectives and specific descriptors related to character traits and qualities
Negative Logits
rero        -0.71
Lands       -0.71
Houses      -0.70
Falk        -0.70
Alz         -0.69
Ri          -0.66
Prescott    -0.65
Bever       -0.65
Yard        -0.64
Berk        -0.64
Positive Logits
ously       1.09
osity       0.97
istic       0.97
iously      0.96
inous       0.95
ous         0.89
eness       0.88
istically   0.87
atory       0.86
inarily     0.85
Activations Density: 0.139%