Explanations
phrases indicating criticism or negative judgment
special characters or empty strings
Negative Logits
Token       Logit
Patriarch   -0.69
Nieto       -0.68
Mellon      -0.64
Miko        -0.64
machine     -0.63
Synd        -0.63
Hole        -0.62
fringe      -0.62
xus         -0.62
Leopard     -0.61
Positive Logits
Token       Logit
rozen       1.39
requently   1.37
ortun       1.36
ortunate    1.33
ocusing     1.32
avour       1.31
amiliar     1.29
ocused      1.28
luent       1.26
requency    1.26
Activations Density: 0.049%