Explanations
the word "that" followed by a dash and an upward arrow symbol
instances of the character "âĢ"
Negative Logits
Rag      -0.77
ropes    -0.65
Marshal  -0.64
gossip   -0.62
salad    -0.62
curses   -0.62
Jav      -0.61
snacks   -0.61
salads   -0.61
Droid    -0.61
Positive Logits
º        1.12
¹        1.04
Ĵ        1.00
£        0.92
®        0.88
ķ        0.88
¼        0.87
ı        0.87
¡        0.87
ates     0.85
Activations Density: 0.068%
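
The "âĢ" in the second explanation and the single-character tokens in the positive logits list look like raw bytes rendered through a byte-level BPE vocabulary rather than encoding damage. The sketch below assumes the underlying model uses the GPT-2 style byte-to-character table, which this page does not state. The helper names (`bytes_to_unicode`, `token_to_bytes`) are illustrative. Under that assumption, "âĢ" is the two-byte prefix `E2 80` of a three-byte UTF-8 character, and each single-character positive-logit token would be the third byte that completes it.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Rebuild the GPT-2 style table that maps each byte to a printable character.

    Assumption: the model behind this page uses this table.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    chars = printable[:]
    n = 0
    for b in range(256):
        if b not in printable:
            # Bytes without a printable form are shifted to code points >= 256.
            printable.append(b)
            chars.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(printable, chars)}


BYTE_TO_CHAR = bytes_to_unicode()
CHAR_TO_BYTE = {c: b for b, c in BYTE_TO_CHAR.items()}


def token_to_bytes(token: str) -> bytes:
    """Convert a vocabulary string back into the raw bytes it stands for."""
    return bytes(CHAR_TO_BYTE[ch] for ch in token)


prefix = token_to_bytes("âĢ")
print(prefix.hex())  # e280

# Single-character tokens from the positive logits list ("ates" is ordinary text).
for tok in ["º", "¹", "Ĵ", "£", "®", "ķ", "¼", "ı", "¡"]:
    completed = (prefix + token_to_bytes(tok)).decode("utf-8")
    print(f"{tok!r} completes the prefix to U+{ord(completed):04X}")
```

If the assumption holds, every completed character falls in the Unicode General Punctuation block (U+2000 to U+203F), which would fit a feature that fires on the "âĢ" prefix and promotes its continuation byte.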