Explanations
references to legal and governmental terms or entities
Negative Logits
Token         Logit
narci         -0.40
crees         -0.39
Unterhaltung  -0.35
Y             -0.34
z             -0.34
garota        -0.34
zen           -0.34
(blank)       -0.34
on            -0.34
rest          -0.34
Positive Logits
Token     Logit
houſe     0.86
Jefus     0.83
Houſe     0.81
ſelf      0.81
myſelf    0.80
himſelf   0.77
Monfieur  0.74
itſelf    0.74
Anſ       0.73
pleaſure  0.73
Activations Density: 1.121%