INDEX
Explanations
names and references to individuals, particularly those involved in political or legal matters
mentions of specific individuals, particularly those named Nadler
Negative Logits
osuke    -0.92
ous      -0.86
ously    -0.82
birds    -0.74
ive      -0.69
draw     -0.68
oin      -0.67
osate    -0.66
utical   -0.65
OUS      -0.65
Positive Logits
rament       0.83
aleigh       0.82
acher        0.81
roth         0.80
ablishment   0.75
gew          0.74
zel          0.74
Reich        0.74
ener         0.73
ged          0.72
Activations Density 0.068%