INDEX
Explanations
elements related to fundamental objections and arguments against established beliefs or theories
Negative Logits
seamnă      -0.68
nossas      -0.62
our         -0.58
birbir      -0.55
mūsų        -0.55
nossos      -0.53
กัน          -0.52
noastre     -0.51
montanha    -0.51
unsere      -0.50
Positive Logits
himself     1.18
himself     0.93
Notae       0.74
herself     0.72
NSCoder     0.67
">//        0.65
Himself     0.65
his         0.62
úrese       0.60
متعلقه      0.58
Activations Density 0.686%