INDEX
Explanations
references to investigations related to political figures and potential misconduct
Negative Logits
Lay       -0.38
b         -0.34
k         -0.32
private   -0.31
reposo    -0.31
διά       -0.30
beckon    -0.30
private   -0.30
fresh     -0.29
urma      -0.29
Positive Logits
pleaſure      0.75
juſ           0.67
myſelf        0.64
betweenstory  0.62
              0.61
Anſ           0.60
purpoſe       0.60
ſch           0.59
Manbalar      0.59
preſent       0.59
Activations Density 0.074%