Explanations
terms related to underdogs or characters that struggle with adversity
Negative Logits
Token         Logit
SharedCtor    -0.44
际            -0.43
black         -0.42
engu          -0.41
,             -0.39
刻            -0.39
independence  -0.39
rude          -0.39
ecas          -0.39
Ehre          -0.39
Positive Logits
Token         Logit
Theſe         0.93
softer        0.92
softness      0.91
SOFT          0.90
soft          0.88
softens       0.88
soft          0.87
Soft          0.86
softening     0.85
soften        0.84
Activations Density: 0.298%