Explanations
Japanese particles and sentence structure indicators
Negative Logits
Monfieur    -0.93
Jefus       -0.91
houſe       -0.91
itſelf      -0.91
poffible    -0.90
Chriftian   -0.86
reaſon      -0.86
iſt         -0.84
Efq         -0.84
purpoſe     -0.84
Positive Logits
を          1.15
을          1.08
みを        1.04
子を        1.03
를          1.00
曲を        0.98
いを        0.98
devamını    0.95
を          0.92
를          0.90
Activation Density: 0.022%