Explanations
identifiers or codes associated with entities or subjects
Negative Logits
Token        Logit
andise       -1.02
Seym         -0.73
bilt         -0.72
silence      -0.67
Nun          -0.64
ruciating    -0.63
ndum         -0.62
orsi         -0.62
indo         -0.59
Papua        -0.58
Positive Logits
Token        Logit
irect        1.15
irection     1.03
iots         1.01
ocument      1.00
imensional   0.98
escription   0.92
DEN          0.90
ouble        0.89
itor         0.87
ividual      0.86
Activations Density: 0.015%