INDEX
Explanations
phrases related to legal proceedings or political figures
references to the name "Roberts" in various contexts
NEGATIVE LOGITS
liest      -0.92
alities    -0.77
lihood     -0.74
ascar      -0.71
riots      -0.69
mble       -0.69
Beir       -0.69
uate       -0.66
uated      -0.66
ality      -0.65
POSITIVE LOGITS
haw        1.08
ullivan    0.87
eln        0.86
onian      0.85
otle       0.82
burgh      0.79
dale       0.77
ç·         0.76
DOM        0.75
eye        0.75
Activations Density 0.054%
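
The source does not define the density figure. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is nonzero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all; the source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 5 of them with nonzero activation.
sample = [0.0] * 9995 + [0.3, 1.2, 0.7, 2.1, 0.05]
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.050%
```

Under this reading, a density of 0.054% would correspond to the feature firing on roughly 54 of every 100,000 sampled tokens.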