INDEX
Explanations
phrases that indicate the presence of items or descriptions associated with other nouns
Negative Logits
myſelf      -1.08
houſe       -0.97
purpoſe     -0.92
Monfieur    -0.92
ſelf        -0.91
iſt         -0.88
Jefus       -0.87
cauſe       -0.86
ſta         -0.86
ſche        -0.85
Positive Logits
có          0.66
ที่มี         0.63
with        0.60
mita        0.59
a           0.59
no          0.57
has         0.57
af          0.56
Has         0.56
Sinne       0.56
Activations Density: 0.249%