INDEX
Explanations
words related to legal issues or consequences
instances of punctuation or breaks in the flow of text
Negative Logits
ngth            -0.65
abouts          -0.59
Availability    -0.59
erie            -0.59
misunder        -0.57
Foundation      -0.56
chin            -0.56
ãĤ£             -0.56
icro            -0.55
ophon           -0.55
Positive Logits
lest            1.18
huh             1.08
eh              1.02
etc             1.01
aka             0.91
anyway          0.88
thereby         0.88
or              0.87
preferably      0.86
respectively    0.85
Activations Density: 0.509%