INDEX
Explanations
contractions of words
repeated instances of the word "I've"
Negative Logits
lapt          -0.74
Skydragon     -0.66
龍契士        -0.61
owl           -0.59
scoreboard    -0.59
jud           -0.58
Reviewer      -0.57
ological      -0.56
keywords      -0.55
separ         -0.55
Positive Logits
tta           0.96
mber          0.90
theless       0.87
tt            0.87
Been          0.86
been          0.86
ttes          0.86
tti           0.85
been          0.85
ggies         0.84
Activations Density 0.028%