INDEX
Explanations
mentions of values related to respect and integrity
Negative Logits
Token         Logit
Lans          -0.93
concoct       -0.75
helicop       -0.68
paran         -0.67
dreamed       -0.67
NetMessage    -0.66
otine         -0.65
nerv          -0.65
uzzle         -0.64
enthusi       -0.64
Positive Logits
Token         Logit
ability       1.47
ably          1.35
fully         1.04
able          1.03
abilities     1.01
ility         0.93
ful           0.91
ibly          0.90
amental       0.88
FUL           0.88
Activations Density: 0.028%