INDEX
Explanations
phrases related to illegal or problematic actions and consequences
statements or claims about various scenarios, often using the verb "is" to assert conditions or attributes
Negative Logits
Awakens    -0.66
onday      -0.62
Telesc     -0.62
Came       -0.61
ãĤ©        -0.59
Scroll     -0.58
sburg      -0.57
kson       -0.57
peg        -0.57
Brow       -0.56
Positive Logits
prohibited     1.15
commonplace    1.12
often          1.11
usually        1.11
rael           1.11
commonly       1.07
rarely         1.04
outlawed       1.03
frowned        1.02
generally      1.02
Activations Density 0.292%
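
The density figure above is not defined on this page. A common reading is the fraction of token positions on which the feature fires at all. The sketch below illustrates that reading only: the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active; the dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 3 active positions out of 1000 tokens.
sample = [0.0] * 997 + [0.8, 1.3, 0.2]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.292% would mean the feature is active on roughly 3 of every 1,000 tokens in the evaluated text.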