Explanations
quotes or statements enclosed in special characters (e.g., Ċ) within text
the presence of different formats of citations or references
Negative Logits
Token             Logit
derby             -0.69
Thornton          -0.68
lled              -0.67
lifes             -0.66
afar              -0.65
administrative    -0.64
laun              -0.62
relegation        -0.61
brunt             -0.61
drums             -0.61
Positive Logits
Token             Logit
"â̦                1.03
"...              0.95
Liter             0.92
Dialogue          0.86
""                0.83
"â̦                0.81
Friend            0.81
âĺħ               0.79
Quote             0.79
"...              0.79
Activations Density 0.206%
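
Some token strings above (for example Ċ in the explanation and âĺħ in the positive logits) look like byte-level BPE display forms rather than ordinary text. The sketch below shows one way to map such strings back to readable characters. It assumes the vocabulary uses the GPT-2-style byte-to-unicode table, which this page does not state; the helper names `bytes_to_unicode` and `decode_token` are illustrative, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the GPT-2-style table mapping each byte (0-255) to a display character.

    Assumption: the tokens on this page were rendered with this table.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is shifted to a code point at or above 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: display character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(display):
    """Turn a byte-level display string back into UTF-8 text.

    Raises KeyError if the string contains a character outside the table,
    which can happen when the display string was damaged during extraction.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in display)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(repr(decode_token("\u010a")))              # display form "Ċ" -> '\n'
    print(repr(decode_token("\u00e2\u013a\u0127")))  # display form "âĺħ" -> '★'
```

Under that assumption, Ċ is the newline byte and âĺħ is the three-byte UTF-8 encoding of a star character, which fits the explanation's reading of this feature as tracking quoted or specially delimited text.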