INDEX
Explanations
comparative phrases and specific examples of concepts
Negative Logits
rê      -0.48
גון     -0.47
cientí  -0.46
raccol  -0.45
např    -0.44
pecies  -0.43
textos  -0.43
cuci    -0.43
like    -0.43
telles  -0.43
Positive Logits
Houſe   0.88
houſe   0.86
myſelf  0.81
ſelf    0.81
poffe   0.80
chofe   0.79
ſelves  0.78
itſelf  0.78
Jefus   0.76
raiſ    0.75
Activations Density: 0.107%