Explanations
names of people
references to specific individuals or names
Negative Logits
apex            -0.54
Citiz           -0.54
ambassadors     -0.53
comprom         -0.53
reciprocal      -0.52
independents    -0.51
rivals          -0.51
intermedi       -0.50
envy            -0.49
psi             -0.49
Positive Logits
Jr              1.00
owski           0.95
chuk            0.92
ovich           0.88
owicz           0.86
iewicz          0.85
ansky           0.85
baum            0.84
(@              0.84
berger          0.83
Activations Density: 0.312%