INDEX
Explanations
relevant information related to legal issues and political statements
Negative Logits
fter     -0.82
ggles    -0.77
][/      -0.75
rome     -0.69
gery     -0.64
tale     -0.63
[/       -0.63
iece     -0.63
eah      -0.63
Ud       -0.63
Positive Logits
itself        0.80
its           0.72
predecessor   0.69
reserves      0.68
annex         0.68
advert        0.67
neighbour     0.66
divisions     0.65
operational   0.65
20439         0.64
Activations Density: 0.375%