INDEX
Explanations
the letter "t" as a single token
the negative contractions of the verb "to be."
Negative Logits
tremend         -0.80
indo            -0.70
behavi          -0.68
Reviewer        -0.68
SetTextColor    -0.66
Reloaded        -0.64
Palestin        -0.62
çĭ              -0.62
Practices       -0.61
packs           -0.61
Positive Logits
itles           1.00
ween            0.97
ional           0.96
ople            0.94
urtles          0.93
otally          0.92
itled           0.92
apest           0.91
roph            0.90
ruly            0.90
Activation Density: 0.047%
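
To make the density figure concrete: assuming it reports the fraction of sampled token positions on which the feature fires at all (the page does not state its definition), 0.047% corresponds to roughly 47 active positions per 100,000 tokens. The sketch below computes that fraction; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and for illustration only.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active (activation > threshold). This is not confirmed
    by the source page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 100,000 token positions, 47 of which are active.
sample = [0.0] * 99_953 + [1.5] * 47
density = activation_density(sample)
print(f"{density:.3%}")  # prints 0.047%
```

A density this low would mean the feature is sparse: it is silent on almost every token and fires only on a narrow set of contexts.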