INDEX
Explanations
references to the word "Good"
occurrences of the word "Good."
Negative Logits
Token                Logit
hars                 -0.65
pent                 -0.65
apse                 -0.64
ptin                 -0.62
opers                -0.61
ounter               -0.61
isive                -0.61
ãĢij                  -0.60
succeeding           -0.60
natureconservancy    -0.59
Positive Logits
Token                Logit
reads                1.30
bye                  1.21
enough               1.20
luck                 1.16
Samar                1.05
night                1.03
morning              1.02
friend               1.00
spr                  0.98
will                 0.97
Activations Density 0.021%
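
As a rough illustration of the density figure above, the sketch below assumes that activation density means the fraction of sampled tokens on which the feature fires (activation above zero). That definition, the function name `activation_density`, and the sample data are all hypothetical; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature is active.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, of which 2 activate the feature.
sample = [0.0] * 9998 + [1.7, 0.4]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.021% would correspond to the feature firing on roughly 2 of every 10,000 tokens.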