INDEX
Explanations
HTML list elements
the HTML list item tags in the document
Negative Logits
lain          -0.78
Debor         -0.74
detail        -0.73
locks         -0.73
Cosponsors    -0.68
rations       -0.66
mble          -0.66
manship       -0.66
rals          -0.65
ilater        -0.65
Positive Logits
utenant       1.22
zzle          1.19
ptin          1.18
uci           0.99
pper          0.95
ars           0.92
udic          0.92
veland        0.90
pton          0.88
ppe           0.86
Activations Density: 0.007%