INDEX
Explanations
names of specific individuals
references to specific individuals or organizations
Negative Logits
ysis      -0.82
agy       -0.77
er        -0.76
usc       -0.75
safe      -0.73
ership    -0.72
hang      -0.72
sticks    -0.70
itiz      -0.70
ocity     -0.70
Positive Logits
Fraser      0.88
incinn      0.77
forth       0.75
Opposition  0.72
atoon       0.72
regor       0.69
EStream     0.68
PLA         0.67
clan        0.67
waters      0.66
Activations Density 0.032%