INDEX
Explanations
mentions of political figures and their involvement or statements regarding political events
Negative Logits
Token          Logit
―――――          -0.90
cauſe          -0.86
/**            -0.84
houſe          -0.84
Jefus          -0.82
itſelf         -0.82
Anſ            -0.81
ſte            -0.81
Efq            -0.80
Démographie    -0.80
Positive Logits
Token          Logit
               0.49
               0.48
‘              0.47
in             0.46
</b>           0.42
ässe           0.42
'              0.41
i              0.41
being          0.39
her            0.38
Activations Density 0.118%