INDEX
Explanations
Indications of script or code structure
NEGATIVE LOGITS
Token     Logit
ing       -0.70
pant      -0.62
ledge     -0.61
ше        -0.61
ous       -0.60
y         -0.60
Kron      -0.59
pant      -0.59
Vill      -0.58
belt      -0.58
POSITIVE LOGITS
Token     Logit
}}$}      1.79
}))       1.63
})$}      1.63
__':      1.56
]")]      1.54
]$}       1.53
.)}       1.53
)}        1.53
})*/      1.49
}))       1.49
Activations Density 0.020%