INDEX
Explanations
proper nouns, particularly names of people and characters
Negative Logits
Token         Logit
planners      -0.72
Leilan        -0.62
decor         -0.57
Koreans       -0.57
Legislation   -0.56
Russians      -0.56
planner       -0.56
Contracts     -0.55
liberals      -0.55
bureaucrats   -0.55
Positive Logits
Token         Logit
steen         0.76
enegger       0.73
emonic        0.73
acci          0.72
ideon         0.71
lopp          0.70
estern        0.70
andowski      0.70
buquerque     0.70
eeper         0.69
Activations Density: 0.534%