INDEX
Explanations
proper nouns related to famous figures or places
large quantities of commas or punctuation, suggesting a focus on lists or multiple points in the text
Negative Logits
oir        -0.60
EO         -0.56
irection   -0.55
rom        -0.54
iku        -0.53
ore        -0.52
ord        -0.52
atively    -0.52
ishly      -0.52
rome       -0.51
Positive Logits
respectively   0.79
âĵĺ            0.65
zbollah        0.60
udeb           0.60
usalem         0.58
itsch          0.56
Belfast        0.53
luaj           0.53
Fla            0.52
Sammy          0.50
Activations Density 0.380%
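
The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of token positions on which the feature fires at all. The sketch below illustrates that reading only; the function name `activation_density`, the zero threshold, and the sample data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions with a
    nonzero activation. The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 10,000 token positions, 38 of them active.
sample = [0.0] * 9962 + [1.0] * 38
density = activation_density(sample)
print(f"{density:.3%}")
```

With 38 active positions out of 10,000, this toy sample gives the same 0.380% shown above. The match is by construction, since the real token count behind the dashboard figure is not given.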