INDEX
Explanations
instances of the letter 'n'
Negative Logits
Token         Logit
Lumpur        -0.76
Lama          -0.74
Ferdinand     -0.67
Doctors       -0.64
NEY           -0.63
Drain         -0.62
Vaj           -0.61
boss          -0.61
EntityItem    -0.59
Coun          -0.59
Positive Logits
Token         Logit
itty          0.93
onda          0.91
umin          0.89
apo           0.88
ano           0.82
anny          0.81
urs           0.79
Drive         0.78
vidia         0.78
idem          0.78
Activations Density: 0.144%