Explanations
mentions of legal cases and political events
Negative Logits
Aut       -0.58
Siber     -0.56
Ball      -0.56
param     -0.54
reconc    -0.54
ank       -0.53
Pil       -0.53
sers      -0.52
ICAL      -0.52
Nav       -0.51
Positive Logits
attest    0.86
ought     0.83
fame      0.81
outwe     0.81
outweigh  0.81
ensured   0.78
were      0.77
appeared  0.76
are       0.76
tended    0.76
Activations Density: 6.260%