Explanations
words related to institutions and management
plan and management
Negative Logits
(no visible token)    -0.65
<bos>                 -0.64
<eos>                 -0.62
↵                     -0.54
-                     -0.52
I                     -0.50
“                     -0.49
(                     -0.48
The                   -0.48
Todavía               -0.47
Positive Logits
Monfieur      1.20
Efq           1.19
Shakspeare    1.15
greateſt      1.11
houſe         1.09
Theſe         1.09
themſelves    1.08
myſelf        1.07
ſeveral       1.07
ſtate         1.07
Activations Density 0.634%