INDEX
Explanations
words related to explosives or incendiary devices
occurrences of the suffix "ov"
Negative Logits
Ĥª            -0.72
dress         -0.69
cane          -0.66
prevailing    -0.64
prevail       -0.61
prep          -0.60
pigeon        -0.57
giving        -0.55
commod        -0.55
fix           -0.55
Positive Logits
ski           1.42
ideo          1.14
irus          1.10
ille          1.04
itz           1.00
rolet         0.99
itch          0.97
sky           0.96
arov          0.95
olt           0.95
Activations Density: 0.018%