INDEX
Explanations
phrases indicating strong opinions or beliefs
connections to significant moral or ethical themes
Negative Logits
Awakens       -0.56
Realms        -0.55
Masquerade    -0.53
Mak           -0.53
Kenn          -0.52
Sn            -0.52
Kardash       -0.51
Nik           -0.50
zo            -0.50
flyers        -0.50
Positive Logits
maxwell       0.65
lie           0.62
onite         0.58
ieu           0.57
etheless      0.57
JECT          0.56
Topic         0.56
hement        0.56
oldown        0.56
ŃĶ            0.56
Activations Density: 1.440%