Explanations
words related to properties or statuses, particularly in the context of characteristics or qualities
Negative Logits
Token             Logit
myſelf            -1.04
autorytatywna     -0.94
RectangleBorder   -0.92
itſelf            -0.90
themſelves        -0.88
himſelf           -0.88
fubject           -0.87
)}</              -0.86
}</               -0.85
purpoſe           -0.82
Positive Logits
Token             Logit
Non               0.72
Non               0.71
non               0.70
__                0.64
非                0.61
Sc                0.59
()                0.58
non               0.57
Sc                0.56
met               0.51
Activations Density 0.185%