INDEX
Explanations
names of specific individuals, particularly in political or public roles
references to notable individuals, particularly in political and cultural contexts
Negative Logits
Token      Logit
Antar      -0.86
ophon      -0.75
ocard      -0.72
ocre       -0.70
Pearce     -0.68
Avalon     -0.68
utility    -0.66
annis      -0.65
erville    -0.64
Phant      -0.64
Positive Logits
Token      Logit
DeVos      1.07
etsy       0.90
agall      0.81
enic       0.80
lisher     0.78
lain       0.74
heimer     0.73
ドラ       0.72
cot        0.71
habi       0.71
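Token strings in listings like this are often shown in a tokenizer's internal byte-level form, where a non-ASCII token such as ドラ appears as ãĥīãĥ©. The sketch below shows one way to map such a string back to readable text. It assumes the vocabulary uses the GPT-2 style byte-to-character table, which the page itself does not state. The helper names `byte_to_char_table` and `decode_token` are illustrative, not part of any library.

```python
def byte_to_char_table():
    """Build the GPT-2 style byte -> printable character table (assumed tokenizer)."""
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    table = {b: chr(b) for b in printable}
    # Every remaining byte is shifted to code points starting at 256.
    offset = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + offset)
            offset += 1
    return table


def decode_token(token):
    """Convert a byte-level token string back to UTF-8 text."""
    char_to_byte = {ch: b for b, ch in byte_to_char_table().items()}
    raw = bytes(char_to_byte[ch] for ch in token)
    # Tokens can end mid-character, so replace undecodable bytes instead of failing.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["DeVos", "heimer", "\u00e3\u0125\u012b\u00e3\u0125\u00a9"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Plain ASCII tokens pass through unchanged, so the function can be applied to every row of the table. A character outside the 256-entry table raises `KeyError`, which would suggest the string is not in this byte-level form.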
Activations Density 0.024%