INDEX
Explanations
mentions of positions of authority or seniority
occurrences of the word "senior" and its variations
Negative Logits
Token     Logit
pel       -0.95
closed    -0.80
doors     -0.78
tails     -0.77
bones     -0.73
weet      -0.70
bags      -0.70
door      -0.70
apesh     -0.67
last      -0.67
Positive Logits
Token     Logit
ity       1.22
citizen   0.98
citiz     0.93
lecturer  0.90
advisor   0.87
citizens  0.86
adviser   0.84
thesis    0.79
ities     0.77
fellow    0.76
Activations Density: 0.032%