Explanations
mentions of something being honorable
terms related to honorable mentions and recognition
Negative Logits
aum      -0.83
reth     -0.82
icion    -0.80
anas     -0.78
etter    -0.77
eters    -0.77
romy     -0.71
Na       -0.71
rait     -0.70
othe     -0.69
Positive Logits
nesses     0.93
Magikarp   0.84
orable     0.80
NESS       0.74
applause   0.68
ness       0.67
justice    0.66
Skies      0.63
honorable  0.63
ABLE       0.62
Activations Density: 0.024%