INDEX
Explanations
mentions of the word "Prime"
Negative Logits
Token      Logit
enegger    -0.75
okia       -0.73
anners     -0.73
ocket      -0.72
kes        -0.72
anwhile    -0.71
rouch      -0.69
Canaver    -0.69
arettes    -0.68
prus       -0.67
Positive Logits
Token      Logit
knit       1.25
val        1.11
etime      0.80
Rib        0.79
eele       0.76
Directive  0.74
thood      0.74
iaries     0.73
iary       0.67
9999       0.67
Activations Density 0.020%