INDEX
Explanations
phrases related to politics and power dynamics
phrases related to intelligence and critical thinking
Negative Logits
Token         Logit
orer          -0.69
TION          -0.68
idae          -0.67
escription    -0.66
ccording      -0.65
Driver        -0.65
payment       -0.64
¤             -0.64
Leod          -0.62
Previous      -0.62
Positive Logits
Token         Logit
halls         1.88
offices       1.71
corridors     1.70
classrooms    1.63
rooms         1.60
desks         1.53
capitals      1.53
bedrooms      1.50
kitchens      1.48
chambers      1.45
Activations Density: 0.572%