Explanations
words related to scientific vocabulary and terminology
Negative Logits
itſelf          -1.08
pleaſure        -1.01
myſelf          -1.01
Personensuche   -0.98
Monfieur        -0.96
houſe           -0.96
Jefus           -0.95
themſelves      -0.93
ſever           -0.91
Chriftian       -0.90
Positive Logits
last            0.56
hena            0.55
u               0.51
}}_{\           0.51
                0.51
er              0.50
near            0.50
}}_{            0.50
past            0.50
Bel             0.50
Activations Density 0.025%