INDEX
Explanations
names or specific terms that appear in various contexts, potentially related to people, places, or events
Negative Logits
ħĭ              -0.80
ĸļ              -0.72
ACTED           -0.72
enactment       -0.67
exception       -0.67
ĪĴ              -0.63
caucuses        -0.62
consistency     -0.62
scrut           -0.61
depreciation    -0.59
Positive Logits
raltar          0.93
apon            0.92
onz             0.91
anan            0.87
reau            0.86
ierrez          0.85
illion          0.85
ppe             0.84
alez            0.83
apons           0.82
Activations Density 0.072%