INDEX
Explanations
meth labs create toxic waste
Negative Logits
im           0.63
en           0.58
Block        0.51
Un           0.50
Jeff         0.50
Ab           0.50
Seal         0.48
Front        0.48
Vor          0.48
ImportGroup  0.48
Positive Logits
crossed    0.51
wasn       0.48
drove      0.46
Wouldn     0.45
assisted   0.44
RSV        0.44
           0.43
pleases    0.43
nessa      0.42
pretended  0.41
Activations Density: 0.004%