INDEX
Explanations
mentions of the upper class or high positions
Negative Logits
tnc         -0.74
Machina     -0.74
eln         -0.74
76561       -0.72
ensable     -0.71
printf      -0.70
atile       -0.69
Wikipedia   -0.68
inatory     -0.67
roma        -0.67
Positive Logits
most        1.23
lobe        0.89
class       0.80
earners     0.80
jaw         0.79
bounds      0.78
limb        0.77
lip         0.77
peninsula   0.76
tier        0.74
Activations Density 0.024%