INDEX
Explanations
instances of the phrase "I am."
Negative Logits
EStreamFrame  -0.67
leaflets      -0.64
juven         -0.62
unlaw         -0.62
targ          -0.62
snowball      -0.61
Dunk          -0.61
CODE          -0.59
kinderg       -0.59
destro        -0.58
Positive Logits
ethyst   1.25
sterdam  1.16
ateurs   1.14
ateur    1.11
nesty    1.05
azon     1.05
azes     1.01
ajor     0.95
azing    0.93
anda     0.93
Activations Density 0.011%