INDEX
Explanations
mentions of names of people or organizations
proper names, particularly individuals and organizations related to media, politics, and events
Negative Logits
atform      -0.83
FG          -0.70
quarters    -0.67
leneck      -0.65
etheus      -0.64
psychiat    -0.64
incom       -0.63
skelet      -0.62
£ı          -0.61
yright      -0.60
Positive Logits
belt        0.81
rome        0.70
uous        0.69
Hera        0.69
oga         0.67
utsche      0.67
cheat       0.67
atchewan    0.66
pants       0.65
pta         0.64
Activations Density 0.460%