INDEX
Explanations
statements of fact or specific information
statements indicating the presence or absence of issues or conditions
Negative Logits
fuck          -0.85
stars         -0.79
clich         -0.77
evils         -0.72
Characters    -0.71
tits          -0.69
metaphors     -0.69
sucks         -0.67
fuck          -0.65
desserts      -0.65
Positive Logits
currently      1.08
widespread     1.05
speculation    0.97
ongoing        0.97
disagreement   0.95
ample          0.94
stantial       0.93
evidence       0.92
no             0.90
extensive      0.89
Activations Density: 0.143%