Explanations
characters and symbols used in titles or subtitles
references to specific statistics or structured data within a context
Negative Logits
shred           -0.71
forgetting      -0.70
envy            -0.69
civilisation    -0.69
identifiable    -0.68
litter          -0.67
implant         -0.67
clutter         -0.66
blinding        -0.66
venge           -0.65
Positive Logits
MEN             0.99
GOODMAN         0.96
ENG             0.92
ANS             0.91
POL             0.88
VIEW            0.88
Protesters      0.86
GROUND          0.85
VIDE            0.84
Editors         0.83
Activations Density: 0.223%