INDEX
Explanations
specific symbols or characters within the text
phrases related to legal proceedings or news articles mentioning specific individuals
Negative Logits
enei        -0.67
opian       -0.65
Centauri    -0.64
zoo         -0.64
level       -0.64
drain       -0.64
indo        -0.63
infring     -0.63
alley       -0.62
lounge      -0.61
Positive Logits
particularly    0.97
whose           0.95
including       0.95
especially      0.92
perhaps         0.92
————            0.91
again           0.89
along           0.87
pictured        0.83
albeit          0.82
Activations Density 0.200%