Explanations
HTML tags
HTML tags and structure elements
Negative Logits
Token      Logit
brutally   -0.74
harshly    -0.69
wrongly    -0.68
deval      -0.68
injust     -0.66
ultras     -0.65
76561      -0.65
reperc     -0.65
sounding   -0.64
vulner     -0.63
Positive Logits
Token      Logit
div        1.05
sv         1.04
select     0.88
TABLE      0.86
template   0.85
figure     0.84
paragraph  0.81
td         0.81
window     0.78
mission    0.78
Activations Density: 0.011%