INDEX
Explanations
information or facts presented in a list format
the repeated use of the word "are" in various contexts
Negative Logits
Token          Logit
culosis        -0.76
amphetamine    -0.72
iosity         -0.72
odder          -0.70
iture          -0.68
sed            -0.65
hate           -0.64
etheless       -0.62
aired          -0.62
akery          -0.61
Positive Logits
Token          Logit
screenshots    0.87
examples       0.83
excerpts       0.76
links          0.75
snapshots      0.72
excerpt        0.72
quotes         0.70
tentative      0.69
hoping         0.68
highlights     0.67
Activations Density: 0.018%