INDEX
Explanations
mentions or references to specific individuals or entities
repeated special characters or symbols in a text
Negative Logits
envy          -0.74
hindsight     -0.73
Belg          -0.67
Guinness      -0.67
obfusc        -0.65
extrap        -0.65
shack         -0.65
Lag           -0.64
disappoint    -0.63
decomp        -0.62
Positive Logits
ı             1.58
¬             1.42
º             1.41
ª             1.40
Į             1.38
Ń             1.37
²             1.35
£             1.35
Ĵ             1.31
IJ            1.31
Activations Density 0.149%