INDEX
Explanations
the concept or notion being discussed or explored in the text
phrases that express concepts or notions
Negative Logits
Rated         -0.60
ICA           -0.58
KO            -0.56
dos           -0.54
ndum          -0.54
zon           -0.53
Neg           -0.53
ECA           -0.53
ilaterally    -0.52
Law           -0.52
Positive Logits
arises        1.00
arose         0.99
behind        0.97
originated    0.92
persists      0.92
derives       0.87
stems         0.87
seems         0.81
horr          0.81
of            0.80
Activations Density: 0.148%