INDEX
Explanations
specific names of individuals, particularly in the context of Japanese names
Negative Logits
Token       Logit
itſelf      -0.93
purpoſe     -0.91
ſelf        -0.82
houſe       -0.78
ſtand       -0.76
reaſon      -0.76
ſelves      -0.76
myſelf      -0.75
uſe         -0.73
pleaſure    -0.72
Positive Logits
Token       Logit
Mr          0.61
Mr          0.52
гій         0.52
señor       0.50
senhor      0.49
ringan      0.48
sir         0.47
récents     0.47
Dr          0.46
cappello    0.46
Activations Density 0.093%
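
The density figure can be read as the share of sampled tokens on which the feature fires. The sketch below illustrates that reading only: the page does not define the metric, so the zero threshold, the function name `activation_density`, and the sample data are all assumptions made for illustration.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    is active. The source page does not state its exact definition.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: 10,000 tokens, 9 of which activate the feature.
sample = [0.0] * 9991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.093% would mean the feature is active on roughly 9 of every 10,000 tokens, which is typical of a sparse feature.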