INDEX
Explanations
phrases or words referring to or mentioning something specific
phrases that involve references to specific topics or subjects
Negative Logits
laureate    -0.66
/#          -0.65
itte        -0.65
tumblr      -0.62
ãĥīãĥ©      -0.60
ethical     -0.58
unker       -0.58
osate       -0.58
ontent      -0.58
ön          -0.57
Positive Logits
uations     0.69
gars        0.66
Nib         0.64
Nieto       0.63
quotations  0.63
Poe         0.61
ename       0.60
enegger     0.60
sarcast     0.60
referring   0.60
Activations Density: 0.119%