INDEX
Explanations
highlights or emphasis in text through special characters or unusual symbols
strong positive descriptors of a person's character
Negative Logits
Targ        -0.65
Wer         -0.61
favourable  -0.58
Wellington  -0.57
Niet        -0.55
Pok         -0.55
snack       -0.55
trumpet     -0.54
Yon         -0.54
SOS         -0.54
Positive Logits
Ļ  1.33
ļ  1.12
ª  1.09
ľ  1.09
ĸ  1.09
ħ  1.05
Ĩ  1.04
¤  1.03
Ŀ  1.01
Ń  0.99
Activations Density: 0.232%