INDEX
Explanations
mentions of specific individuals and their actions
possessive forms and references to individuals or entities
Negative Logits
Reviewer    -0.80
hari        -0.80
KT          -0.79
livion      -0.76
Quantity    -0.75
ENN         -0.75
BT          -0.75
Ú           -0.74
xxx         -0.74
>]          -0.73
Positive Logits
newest      1.06
biggest     1.06
own         0.96
inability   0.95
youngest    0.94
oldest      0.92
latest      0.92
attorney    0.91
penchant    0.90
greatest    0.89
Activations Density: 0.130%