INDEX
Explanations
lines or phrases indicating a transition, evolution, or change in a process or situation
Negative Logits
Token     Logit
ername    -0.78
paces     -0.74
dad       -0.71
few       -0.71
chool     -0.70
bowl      -0.69
ĸļ        -0.69
rer       -0.68
terms     -0.68
gyn       -0.68
Positive Logits
Token           Logit
goodies         0.82
misinformation  0.82
sorts           0.82
mayhem          0.77
doom            0.76
activity        0.75
sunshine        0.75
insanity        0.74
fame            0.73
contr           0.73
Activations Density: 3.296%